Introduction

Birth weight (BW) is an important predictor of neonatal morbidity and mortality and strongly influences obstetric and neonatal management. Delivery of a macrosomic fetus, for example, is associated with several peripartum complications, such as a prolonged second stage of labor and serious maternal or fetal trauma [1–3]. On the other hand, fetuses with an estimated weight below the 10th percentile for gestational age have higher rates of neonatal mortality and morbidity than normal BW infants and are at greater risk for neurologic and developmental deficits during childhood [4–7].

During the past 40 years, sonographic assessment of the fetus and estimation of its weight have become part of routine practice in obstetrics. Several formulas have been published, most of them involving combinations of several biometric parameters [8–12]. Most of these formulas were developed and evaluated using sonograms performed with a scan-to-delivery interval of up to 14 days [9, 10, 13–19].

However, in many cases, fetal weight estimation (WE) is performed only intrapartum, when the patient presents during the latent or active phase of labor. Although this approach eliminates the potential effect of interim fetal growth, it is accompanied by several other problems. First, as delivery approaches, the fetal head descends into the maternal pelvis, leading to inaccurate measurements of the biparietal diameter (BPD), occipitofrontal diameter (OFD) and head circumference (HC). Second, distortion of the abdominal circumference (AC) and a posterior position of the femora are more frequently observed at this late gestation. Finally, decreasing amounts of amniotic fluid at term could limit the accuracy of all measurements.

To date, only a few studies have evaluated fetal WE close to delivery [20–24], and their results have been inconsistent. While in some studies the accuracy of clinical WE was at least equal to that of sonographic WE [20, 21], in other studies, estimation by ultrasound performed better than clinical WE [22].

However, most of these studies were limited due to small sample sizes. Furthermore, in the majority of cases, only one or two different weight estimation formulas were tested. Therefore, the aim of the present study was to evaluate the accuracy of intrapartum sonographic WE in a large cohort of more than 1900 women using five formulas with different combinations of biometric parameters.

Materials and methods

This retrospective, cross-sectional study included 1958 singleton pregnancies delivered at our perinatal center between 2003 and 2009. Inclusion criteria were singleton pregnancy with cephalic presentation, vaginal delivery, an ultrasound examination with complete biometric parameters performed on the day of delivery during the latent or active phase of labor, and the absence of chromosomal or structural anomalies. The biometric parameters comprised BPD, OFD, HC [HC = 2.325 × (OFD^2 + BPD^2)^(1/2)], abdominal transverse diameter (ATD), abdominal anterior–posterior diameter (APAD), AC [AC = π × (ATD + APAD)/2] and femur length (FL). Cases with intrauterine fetal death were excluded. We also excluded secondary cesarean sections from the study group, because this category includes women who had no contractions before the cesarean section was performed (for example, all women presenting with premature rupture of the membranes (PROM) and receiving a cesarean section for other reasons, e.g., on maternal request, are classified as secondary).
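The two derived circumferences can be illustrated with a short sketch. It simply evaluates the HC and AC expressions given above from the measured diameters (all in centimeters); the function names and the example values are hypothetical and are not taken from the study data.

```python
import math


def head_circumference(bpd_cm: float, ofd_cm: float) -> float:
    """HC = 2.325 * sqrt(OFD^2 + BPD^2), all lengths in cm."""
    return 2.325 * math.sqrt(ofd_cm ** 2 + bpd_cm ** 2)


def abdominal_circumference(atd_cm: float, apad_cm: float) -> float:
    """AC = pi * (ATD + APAD) / 2, all lengths in cm."""
    return math.pi * (atd_cm + apad_cm) / 2


# Illustrative measurements only (hypothetical, not study data).
hc = head_circumference(bpd_cm=9.5, ofd_cm=11.5)
ac = abdominal_circumference(atd_cm=11.0, apad_cm=11.0)
print(f"HC = {hc:.1f} cm, AC = {ac:.1f} cm")
```

Because both circumferences are computed from two diameters each, an error in either diameter propagates directly into the derived value.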

As is clinical routine in our institution, sonographic WE was performed in all women who presented for delivery and had not received a sonographic WE at our clinic within the last 14 days.

The accuracy of intrapartum WE was compared with that in a control group of fetuses delivered by primary cesarean section at our perinatal center between 2003 and 2009 who had an ultrasound examination with complete biometric parameters within 3 days before delivery (n = 392). Otherwise, the same inclusion criteria as in the study group were applied. Primary cesarean section was scheduled for typical obstetric indications (e.g., a history of more than one cesarean section, abnormal placentation, suspected fetal macrosomia).

These patients were chosen as a control group, as they presented without contractions or premature rupture of the membranes and received their ultrasound scan also very close to delivery (to avoid any bias of a possible time effect). Before comparing the measures of accuracy between the study sample and the control group, pairwise nearest neighbour matching was performed based on the actual BW of the fetuses.
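The matching step can be sketched as follows. The text specifies only pairwise nearest neighbour matching on actual BW; the greedy one-to-one strategy without replacement, the tie-breaking rule, the function name and the example weights below are assumptions made for illustration.

```python
def match_nearest_bw(study_bw, control_bw):
    """Greedy 1:1 nearest-neighbour matching on birth weight (grams).

    Assumption: each control case is matched, in order, to the closest
    still-unmatched study case (matching without replacement).
    Returns a list of (control_index, study_index) pairs.
    """
    available = set(range(len(study_bw)))
    pairs = []
    for c_idx, c_bw in enumerate(control_bw):
        if not available:
            break  # no unmatched study cases left
        # Ties are broken by the lower study index to keep results stable.
        s_idx = min(available, key=lambda i: (abs(study_bw[i] - c_bw), i))
        pairs.append((c_idx, s_idx))
        available.remove(s_idx)
    return pairs


# Hypothetical birth weights in grams (not study data).
study = [3100, 3550, 2900, 4020, 3480]
control = [3500, 2950]
pairs = match_nearest_bw(study, control)
print(pairs)
```

A greedy pass depends on the order of the control cases; an optimal assignment would be an alternative if the total BW difference across all pairs should be minimized.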

The examinations were performed by a total of 43 different examiners. All measurements (control and intrapartum group) took place in the delivery unit by the same group of examiners. The majority of the scans were performed by residents in their 2nd or 3rd year of clinical training with a moderate grade of ultrasound experience.

Gestational age was calculated from the last menstrual period and was confirmed by or recalculated with biometric measurements obtained from the first fetal biometry in early pregnancy (in accordance with the recommendations of the American College of Obstetricians and Gynecologists, ACOG) [25]. The examinations were performed in accordance with the widely accepted quality standards [26, 27]. Birth weight and neonatal length were measured within 1 h after delivery by the nursing staff. For fetal WE, five widely used formulas of Hadlock et al. [10, 15] were employed, including different combinations of the biometric parameters HC, BPD, AC and FL (Table 1). Measurements are given in centimeters and BW in grams. At our hospital, sonographic fetal WE is routinely performed during the diagnostic work-up. Ethical approval for the study was, therefore, not sought.

Table 1 Regression models for fetal weight estimation

Accuracy of the estimated fetal weight (EFW) was assessed by calculating (1) the percentage error (PE): (EFW − BW)/BW × 100, the mean of which reflects the systematic deviation of a model from the actual BW; (2) the random error (standard deviation of the PE), a measure of precision that reflects the random component of the prediction error; (3) the absolute percentage error (APE): |(EFW − BW)/BW| × 100, which takes both the systematic and the random error into account; and (4) the percentage of fetal WEs falling within ±10 % of the actual BW.
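The four measures can be made concrete with a minimal sketch using only the Python standard library. The function name and the example weights are hypothetical; the median is used to summarize the APE because the comparisons in this study are based on the median of the APE (MAPE).

```python
from statistics import mean, median, stdev


def accuracy_measures(efw, bw):
    """Summarize estimation accuracy for paired EFW and BW values (grams)."""
    # Signed percentage error: negative values mean underestimation.
    pe = [(e - b) / b * 100 for e, b in zip(efw, bw)]
    ape = [abs(x) for x in pe]
    return {
        "MPE": mean(pe),              # systematic error
        "random_error": stdev(pe),    # sample standard deviation of the PE
        "MAPE": median(ape),          # median absolute percentage error
        "within_10_pct": sum(a <= 10 for a in ape) / len(ape) * 100,
    }


# Hypothetical weights in grams (not study data).
efw = [3300, 3150, 3800, 2600]
bw = [3500, 3000, 4000, 3000]
result = accuracy_measures(efw, bw)
print(result)
```

In this toy example the MPE is negative, i.e., the estimates are on average too low, and three of the four estimates fall within ±10 % of the actual BW.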

To test for systematic bias, the means of percentage errors (MPE) for all equations were compared to zero using one-sample t tests.
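As a sketch of this test, the t statistic for comparing a mean PE with zero can be computed as below. The helper name and the PE values are hypothetical; the p value would additionally require the t distribution with n − 1 degrees of freedom, which is available in statistical packages and is omitted here.

```python
import math
from statistics import mean, stdev


def one_sample_t(values, mu0=0.0):
    """Return (t statistic, degrees of freedom) for H0: mean == mu0."""
    n = len(values)
    standard_error = stdev(values) / math.sqrt(n)
    return (mean(values) - mu0) / standard_error, n - 1


# Hypothetical percentage errors (not study data).
pe = [-5.7, 5.0, -5.0, -13.3, -2.1, -8.4]
t_stat, df = one_sample_t(pe)
print(f"t = {t_stat:.2f}, df = {df}")
```

A negative t statistic indicates a mean PE below zero, i.e., a tendency towards underestimation.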

Due to the dependencies resulting from matching, differences between the matched samples were assessed applying test procedures for paired samples. Comparisons are hence based on paired t tests (MPE), the Snedecor and Cochran [28] method (random error), Wilcoxon signed rank tests (median of APE; MAPE) and McNemar tests (APE ≤ 10 %).
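To illustrate the paired comparison of the proportion of estimates with APE ≤ 10 %, the sketch below implements a McNemar test on matched pairs. The text does not state which variant of the test was used; the exact (binomial) two-sided version shown here, the function name and the example data are assumptions.

```python
from math import comb


def mcnemar_exact(study_within, control_within):
    """Exact two-sided McNemar test on matched pairs of booleans.

    Each position holds one matched pair: whether the estimate was within
    10 % of BW in the study case and in its matched control case.
    Returns (b, c, p) where b and c are the discordant pair counts.
    """
    pairs = list(zip(study_within, control_within))
    b = sum(1 for s, c in pairs if s and not c)  # only study within 10 %
    c = sum(1 for s, c in pairs if c and not s)  # only control within 10 %
    n = b + c
    if n == 0:
        return b, c, 1.0  # no discordant pairs, no evidence of a difference
    # Binomial tail probability under H0: discordance is equally likely
    # in either direction (probability 0.5).
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Hypothetical matched pairs (not study data).
study_within = [True, False, False, True, False, False]
control_within = [True, True, True, True, True, False]
b, c, p = mcnemar_exact(study_within, control_within)
print(b, c, p)
```

Only the discordant pairs contribute to the test; pairs in which both or neither estimate fell within 10 % carry no information about a difference between the groups.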

Results

The demographic and obstetric characteristics of the women and fetuses in the different study groups are presented in Table 2. Of the 1958 vaginal deliveries, 93.2 % were spontaneous and in 6.8 %, assisted vaginal deliveries were performed (5.6 % vacuum; 1.2 % forceps). Within the group of vaginal deliveries, membranes were ruptured at the time of the ultrasound scan in 36.2 % (intact membranes 52.3 %; missing data 11.4 %). 66.6 % of the women were in the latent phase of labor, whereas 23.7 % were in the active phase when the ultrasound examinations were performed (missing data 9.7 %). In the latter group, the median cervical dilatation was 5 cm.

Table 2 Principal clinical parameters of the whole study group (n = 1958) and of matched pairs from the study sample (intrapartum, n = 392) and control group (primary cesarean section, n = 392)

For intrapartum WE in the whole study group, all equations showed a systematic underestimation of fetal weight (negative MPEs). A significant bias was found for all MPE values when compared to zero (p < 0.0001). Overall MPE values were closest to zero with the Hadlock II formula, using BPD, AC and FL as biometric parameters (Hadlock II, MPE: −1.28). The largest systematic error was found with the Hadlock III formula, including HC, AC and FL (Hadlock III, MPE: −5.95) (Table 3).

Table 3 Measures of accuracy in intrapartum fetal weight estimations performed in the whole study group (n = 1958)

Regarding MAPEs, the best results were achieved with the formula including BPD, AC and FL (Hadlock II, MAPE: 6.52). Using the AC as the only biometric parameter (Hadlock V) resulted in the largest MAPE (Hadlock V, MAPE 8.49) (Table 3). Similar results were found for the percentage of WEs within 10 % of the actual BW. Again, the Hadlock II formula yielded the best results (Hadlock II: 68.28 %), whereas the Hadlock V equation showed the lowest values (Hadlock V: 57.81 %) (Table 3).

The random errors are also presented in Table 3. Values ranged from 9.56 for the Hadlock III formula (HC, AC, FL) to 12.01 for the Hadlock V equation (AC).

No significant differences were shown between women with ruptured versus intact membranes at the time of the ultrasound scan regarding all measures of accuracy (data not shown). Furthermore, no influence on the accuracy of sonographic WE was found for the stage of labor: similar results were found in the latent versus active phase of labor for all measures of accuracy (data not shown). However, there was a slight trend towards more negative MPEs in the group with ruptured membranes or active labor (data not shown).

In the control group, a total of 392 fetuses, delivered by primary cesarean section, were included. To compare the accuracy of WE between the two groups, fetuses were matched pairwise regarding the actual BW. No relevant differences were found for the other demographic and obstetric characteristics of the fetuses and women between the two matched groups (Table 2).

MPEs differed significantly between WE in the study and control group for all evaluated formulas (Table 4): in the study group, a significant underestimation (negative MPEs, p < 0.0001, data not shown) was shown for all equations; whereas in the control group, either no systematic error (Hadlock III, IV and V) or a significant overestimation (Hadlock I, II) was found (Fig. 1).

Table 4 Comparison of the accuracy of fetal weight estimations in matched pairs from the study sample (intrapartum, n = 392) and control group (primary cesarean section, n = 392)

Fig. 1 Box plots showing the distribution of signed percentage errors obtained with the different formulas in study and control group

Regarding MAPEs (Fig. 2; Table 4), the Hadlock III (HC, AC, FL) and V (AC) formulas yielded significantly lower values in the control group (study vs. control group; Hadlock III, MAPE: 7.48 vs. 5.95, p = 0.0008 and Hadlock V, MAPE: 8.79 vs. 7.52, p = 0.0085). No significant differences were found for the other equations. Analogous results were shown for the percentage of WEs within 10 % of the actual BW (control vs. study group; Hadlock III, 71.94 vs. 59.69 %, p = 0.0007 and Hadlock V, 64.03 vs. 55.10 %, p = 0.0159).

Fig. 2 Box plots showing the distribution of absolute percentage errors obtained with the different formulas in study and control group

No significant differences between the two groups were shown regarding the random errors. Values ranged from 9.32 to 11.86 in the control group and from 9.49 to 12.24 in the study group (Table 4).

Discussion

In this study, intrapartum sonographic WE was evaluated in a large group of more than 1900 women and compared to a control group of patients, presenting for primary cesarean section without any contractions. For all formulas, intrapartum WE showed significant negative MPEs. Values ranged from −6.7 to −2.2 in the matched study group (−6.0 to −1.3 in the complete sample), and from −0.6 to 3.2 in the control group.

Similar results were shown in a study performed by Peregrine et al. [24]: WE with one of the Hadlock formulas, including AC and FL, was performed in 262 women at term prior to induction of labor. The authors found a significant underestimation with an MPE value of −7.6.

In contrast, Melamed et al. [29] analyzed 3672 “standard” WEs within 3 days prior to delivery. The authors evaluated the formulas of Hadlock et al., and MPEs showed a significant overestimation with most of the equations (values ranged from −1.2 to +6.0). These differences could be explained by the fact that in that study, women were included within 3 days prior to delivery irrespective of the delivery mode, the stage of labor or the presence of contractions.

The MPE reflects the systematic deviation of a model from the actual BW: negative values, therefore, represent a systematic underestimation; whereas positive values indicate a systematic overestimation of fetal BW.

The observed intrapartum underestimation can be partly explained by an increasing descent of the fetal head into the maternal pelvis leading to inaccurate small measurements of the BPD, OFD and HC. The formulas not including any fetal head measurement (Hadlock IV and V), however, also showed a significant underestimation. An explanation for this result may be found in an increased risk of abdominal circumference distortion or unfavorable position of the femora during the latent or active phase of labor.

The standard deviation of the PEs represents the random error of WEs. Values of the whole study group ranged from 9.6 for the Hadlock III formula (HC, AC, FL) to 12.1 with the Hadlock V equation (AC). No significant differences between study and control groups were apparent.

In the study of Peregrine et al. [24], similar values were found. WE with one of the Hadlock formulas (AC, FL) resulted in a random error of 10.6. Melamed et al. [29] reported random errors ranging between 13.6 (Hadlock, FL) and 8.1 (Hadlock HC, AC and FL).

The MAPE takes systematic and random errors into account. While in three of the five evaluated formulas no differences regarding MAPEs were found between the two groups, the equation using the AC as a single parameter (Hadlock V) and the formula including HC, AC and FL (Hadlock III) showed better accuracy in the control group. Values of the whole study group ranged from 6.5 for the Hadlock II formula (BPD, AC, FL) to 8.5 with the Hadlock V equation (AC).

In a study of Siemer et al. [14], “standard” WE with 11 different formulas was evaluated in a group of 1941 pregnancies. Each fetus underwent ultrasound examination with complete biometric parameters within 7 days before delivery. Patients were included irrespective of the delivery mode or the stage of labor. In comparison with our results, larger MAPEs were shown: for the Hadlock formulas MAPEs ranged between 8.14 (BPD, AC, FL) and 9.55 (AC).

Basha et al. [17] analyzed “standard” WEs performed in 415 fetuses within 14 days before delivery with one of the Hadlock formulas (HC, AC, FL). They found a MAPE of 8.2 for WE performed within 8–14 days before delivery. Again, patients were included irrespective of the delivery mode or the stage of labor.

The larger MAPE values in the latter two studies might partly be explained by the increased scan-to-delivery interval of up to 14 days: potential problems accompanying intrapartum WE might be outweighed by the interim fetal growth.

In our study, the Hadlock formula using the BPD as the only head measurement (Hadlock II) showed the lowest MAPE and the MPE closest to zero, whereas both formulas that also include the HC (Hadlock I and III) yielded lower accuracy. This might be explained by the fact that, with increasing descent of the fetal head, the BPD can be measured more easily and more accurately than the HC. Similar results were shown by Schmidt et al. [30]: the authors analyzed the influence of the accuracy of fetal head measurements on sonographic WE. Accurate results were found when the BPD was used as the only head measurement.

The percentage of WEs within 10 % of BW ranged between 58 % (Hadlock V, AC) and 68 % (Hadlock II, BPD, AC, FL). These results are consistent with those of other studies analyzing both intrapartum and “standard” WE performed irrespective of the delivery mode or the stage of labor.

Noumi et al. [23] analyzed the accuracy of sonographic and clinical WE performed during the active phase of labor in 192 patients. The percentage of WEs within ±10 % of BW was 74 % with one of the Hadlock formulas using BPD, AC and FL as biometric parameters.

Similar results were found by Farrell et al. [22]: sonographic WE was performed prior to induction of labor in 96 women with one of the Hadlock formulas (HC, AC, FL). Of the ultrasound estimates, 72 % were within 10 % of the BW. In another study, “standard” sonographic WE was performed within 7 days before delivery in 1717 patients, irrespective of the delivery mode or the stage of labor. The authors found 68.7 % of the estimates within 10 % of BW [31].

The strength of our study is the large sample size of the study group in which the intrapartum sonographic WE was evaluated. However, there are also some limitations: due to the retrospective study design, it was not possible to determine in what percentage of the ultrasound scans the optimal views could not be achieved (e.g., because of a progressive descent of the fetal head).

In conclusion, significantly more negative MPEs have to be taken into account in sonographic WE performed intrapartum during the active or latent phase of labor, irrespective of the applied WE formula. As the MPE reflects the systematic deviation of a model from the actual BW, a significant underestimation of fetal weight has, therefore, to be expected in intrapartum WE. In clinical practice, this has to be taken into account when counseling women or determining the clinical procedure during labor.

Overall, the best results regarding intrapartum WE can be achieved with formulas using the BPD as the only head measurement.