Introduction

Accurate fetal weight estimation (WE) is of great importance, as abnormalities of intrauterine growth are associated with several peripartum complications [1–3]. The counseling of patients with fetal macrosomia, breech presentation or preterm labor, for example, is significantly influenced by the estimated fetal weight.

During the past 40 years, sonographic assessment of the fetus and estimation of its weight have become part of routine practice in obstetrics. Several formulas have been published, most of them involving combinations of several biometric parameters [4–7]. So far, most of the studies assessing the accuracy of sonographic weight estimation have analyzed sonograms that were performed close to term, within 7 days before delivery [8–14]. However, in numerous patients fetal WE has either been performed before this time period or is conducted when the patient presents during the latent or active phase of labor. While in the first case interim fetal growth may lead to an underestimation, WE performed intrapartum may be accompanied by several other problems. First, as delivery approaches, the fetal head descends into the maternal pelvis, leading to inaccurate measurements of the biparietal diameter (BPD), occipitofrontal diameter (OFD) and head circumference (HC) [15, 16]. Second, distortion of the abdominal circumference (AC) and a posterior position of the femora are more frequently observed close to delivery. Finally, decreasing amounts of amniotic fluid at term could limit the accuracy of all measurements [16–18].

To date, only a few studies have evaluated the influence of the time interval between examination and delivery on the accuracy of fetal WE [19–22]. However, none of these studies evaluated this topic systematically and in detail using large sample sizes. The aim of the present study was therefore to compare the accuracy of sonographic fetal WE between eight time interval groups, ranging from an estimation on the day of delivery to a maximum of 6 weeks before delivery, by analyzing more than 12,000 biometric measurements.

Materials and methods

The retrospective, cross-sectional study included 8723 singleton pregnancies with a total of 12,420 sonographic weight estimations performed at our university perinatal center between 2003 and 2009. The inclusion criteria were a singleton pregnancy with cephalic presentation; an ultrasound examination within a time interval of 0–42 days before delivery with complete biometric parameters, namely BPD, OFD, HC (HC = 2.325 × (OFD² + BPD²)^(1/2)), abdominal transverse diameter (ATD), abdominal anterior–posterior diameter (APAD), AC (AC = π × (ATD + APAD)/2) and femur length (FL); and an absence of any chromosomal or structural anomalies. Intrauterine fetal deaths were excluded. In our institution, a sonographic weight estimation is routinely performed in all patients registering for delivery (these registrations usually take place between 32 and 38 weeks of gestation). Thus, in the vast majority of cases, this was the indication for the examination.
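The two derived circumferences given in the inclusion criteria can be sketched in code as follows. This is only a minimal illustration, assuming all diameters are in centimeters; the function names and the input values are hypothetical and are not taken from the study.

```python
import math


def head_circumference(bpd_cm: float, ofd_cm: float) -> float:
    """HC = 2.325 * sqrt(OFD^2 + BPD^2), as stated in the inclusion criteria."""
    return 2.325 * math.sqrt(ofd_cm ** 2 + bpd_cm ** 2)


def abdominal_circumference(atd_cm: float, apad_cm: float) -> float:
    """AC = pi * (ATD + APAD) / 2, i.e. the circumference of a circle whose
    diameter is the mean of the two abdominal diameters."""
    return math.pi * (atd_cm + apad_cm) / 2


# Illustrative values only (not study data).
hc = head_circumference(bpd_cm=9.0, ofd_cm=12.0)
ac = abdominal_circumference(atd_cm=10.0, apad_cm=10.0)
```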

Fetuses were divided into eight groups according to the time interval between estimation and delivery. In cases of repeated measurements within one group, only one examination was randomly selected. In our opinion, the time intervals closest to delivery are the most interesting ones from a clinical point of view, as most scans are performed relatively close to delivery. We therefore divided this time period into shorter intervals to obtain more detailed information.

  • Group 1: Weight estimation on the day of delivery, n = 2610

  • Group 2: Time interval between estimation and delivery 1–3 days, n = 3764

  • Group 3: Time interval between estimation and delivery 4–7 days, n = 1409

  • Group 4: Time interval between estimation and delivery 8–14 days, n = 1460

  • Group 5: Time interval between estimation and delivery 15–21 days, n = 1087

  • Group 6: Time interval between estimation and delivery 22–28 days, n = 846

  • Group 7: Time interval between estimation and delivery 29–35 days, n = 695

  • Group 8: Time interval between estimation and delivery 36–42 days, n = 549
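The grouping rule and the random selection of one examination per fetus and group might be sketched as below. The record layout (a list of fetus identifier and interval pairs), the fixed random seed and the function names are assumptions made purely for illustration; the study does not describe how the selection was implemented.

```python
import random
from collections import defaultdict

# Inclusive day ranges for the eight interval groups listed above.
GROUP_BOUNDS = [(0, 0), (1, 3), (4, 7), (8, 14),
                (15, 21), (22, 28), (29, 35), (36, 42)]


def interval_group(days_before_delivery: int) -> int:
    """Return the group number (1-8) for a scan-to-delivery interval in days."""
    for group, (low, high) in enumerate(GROUP_BOUNDS, start=1):
        if low <= days_before_delivery <= high:
            return group
    raise ValueError("interval outside the 0-42 day inclusion window")


def select_one_per_group(exams, seed=0):
    """Keep one randomly chosen examination per (fetus, group) pair.

    `exams` is a list of (fetus_id, days_before_delivery) tuples
    (a hypothetical record layout).
    """
    rng = random.Random(seed)
    buckets = defaultdict(list)
    for fetus_id, days in exams:
        buckets[(fetus_id, interval_group(days))].append((fetus_id, days))
    return [rng.choice(candidates) for candidates in buckets.values()]


# Hypothetical examinations: fetus "A" was scanned twice within group 2.
exams = [("A", 1), ("A", 3), ("A", 10), ("B", 0)]
selected = select_one_per_group(exams)
```

With these inputs, fetus "A" contributes one examination to group 2 and one to group 4, so the same fetus can still appear in several groups, which is why the statistical analysis has to account for repeated measurements.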

Gestational age was calculated from the last menstrual period and was confirmed by, or recalculated with, biometric measurements obtained from the first ultrasonography performed during the pregnancy (in accordance with the recommendations of the American College of Obstetricians and Gynecologists, ACOG) [23]. The examinations were performed in accordance with widely accepted quality standards [24, 25]. Birth weight (BW) and neonatal length were measured within 1 h after delivery by the nursing staff. For estimation of fetal weight, five widely used formulas (Hadlock et al. I and II, Shepard et al., Merz et al. and Warsof et al.) were applied, which include the biometric parameters HC, BPD, AC and FL (formulas listed in Table S1) [5, 6, 8, 26]. Formulas not including any head measurements were not evaluated in the present study, as one of our previous studies showed that these formulas were less accurate [15]. Measurements are given in centimeters and BW in grams. In the department in which the study was conducted, fetal weight is routinely estimated by ultrasound examination during the diagnostic work-up. Ethical approval for the study was therefore not sought.

The accuracy of the estimated fetal weight (EFW) was assessed by calculating: (1) the percentage error (PE): (EFW − BW)/BW × 100, the mean of which reflects the systematic deviation of a model from the actual BW; (2) the random error (standard deviation of the PE), a measure of precision that reflects the random component of the prediction error; (3) the absolute percentage error (APE): |(EFW − BW)/BW| × 100, which takes both the systematic and the random error into account; and (4) the percentage of fetal WEs falling within a prespecified error bound (APE ≤ 10 %).
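The four accuracy measures can be illustrated with a short sketch. It assumes that the random error is the sample standard deviation of the PE and that the APE is summarized by its median (as in the caption of Table 3); the function names and the EFW/BW pairs are hypothetical and are not study data.

```python
import statistics


def percentage_error(efw: float, bw: float) -> float:
    """Signed percentage error: (EFW - BW) / BW * 100."""
    return (efw - bw) / bw * 100


def accuracy_summary(pairs):
    """Summarize accuracy for a list of (EFW, BW) pairs in grams."""
    pes = [percentage_error(efw, bw) for efw, bw in pairs]
    apes = [abs(pe) for pe in pes]
    return {
        "mpe": statistics.mean(pes),            # systematic error
        "random_error": statistics.stdev(pes),  # sample SD of the PE (assumption)
        "mape": statistics.median(apes),        # median APE (assumption)
        "within_10_pct": 100 * sum(a <= 10 for a in apes) / len(apes),
    }


# Hypothetical (EFW, BW) pairs in grams, not study data.
pairs = [(3240, 3000), (2640, 3000), (3150, 3000), (3600, 3000)]
summary = accuracy_summary(pairs)
```

In this toy example the signed errors partly cancel, so the MPE is smaller than the median APE. This is why the MPE alone describes the bias of a formula but not its overall accuracy.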

Mean percentage errors (MPEs) for all formulas and groups were compared to zero using one-sample t tests. Because different interval groups may contain measurements from the same fetus, the classical assumption of independent observations (e.g., for two-sample t tests) is not fulfilled. To compare the prediction accuracy between the interval groups, generalized mixed-effects regression analysis with subject-specific random effects was therefore performed to adjust for these repeated measurements. The models were fitted separately for each formula, with the different accuracy measures as outcome variables (i.e., PE, APE and APE ≤ 10 %) and the interval group as a categorical predictor variable, to obtain adjusted effect estimates for the group differences. Overall differences were assessed via likelihood-ratio tests; group-specific differences relative to measurements on the day of delivery were assessed via Wald tests for the effect estimates, with group 1 as the reference.
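As a minimal sketch of the first step, the one-sample t statistic for testing whether the MPE differs from zero can be computed as follows. Only the test statistic and its degrees of freedom are shown; obtaining the p value requires the t distribution from a statistics library, and the mixed-effects models are not reproduced here. The function name and the percentage errors are hypothetical.

```python
import math
import statistics


def one_sample_t(values, mu0=0.0):
    """Return (t statistic, degrees of freedom) for H0: mean(values) == mu0."""
    n = len(values)
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(n)  # standard error of the mean
    return (mean - mu0) / se, n - 1


# Hypothetical percentage errors for one formula in one interval group.
pes = [-4.0, -1.0, 2.0, -3.0, -6.0, 0.0]
t_stat, dof = one_sample_t(pes)
```

A negative t statistic corresponds to a negative MPE, that is, to a systematic underestimation of the birth weight by the formula in that group.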

Results

The demographic and obstetric characteristics of the women and fetuses in the different groups are presented in Table 1. There were no relevant differences between the groups regarding maternal BMI, maternal age, median gestational age at delivery and median BW.

Table 1 Principal clinical parameters in the different time interval groups

Significant overall differences were found between the time interval groups with regard to MAPE and MPE values for all considered formulas (p < 0.001, data not shown).

In group one, the Hadlock I and Warsof formulas showed a systematic underestimation of fetal weight (negative MPEs, p < 0.001). No relevant systematic error was found with the Hadlock II formula (p = 0.16). The equations of Merz and Shepard showed a systematic overestimation (positive MPEs, p < 0.001) (Table 2).

Table 2 Comparison of the mean percentage errors (MPE) and random errors (RE) of fetal weight estimations in the different time interval groups

MPE values of the Hadlock I, Hadlock II and Warsof formulas were closest to zero in WEs with a time interval of 1–3 days (group two) (Table 2). From group three to group six, MPE values decreased continuously, with significant differences in comparison with group one (Table 2). For the Hadlock II formula, MPEs ranged from −0.41 in group 1 to −5.83 in group 4, −17.65 in group 6 and −30.56 in group 8 (p < 0.001). With the Merz and Shepard equations, MPEs were closest to zero in group four. Overall, the best results were achieved with the Merz formula in group four (MPE: 0.31) and the Hadlock II equation in group two (MPE: 0.35) (Table 2).

The random errors (REs) of WEs in the different groups are presented in Table 2. The Hadlock I, Hadlock II, Warsof and Shepard formulas yielded their largest random errors in WEs of groups one and two. Within the other time interval groups, no major differences were found. Overall, the largest random error was found with the Merz equation in group eight (RE: 23.27).

Using the Hadlock I, Hadlock II and Warsof formulas, the lowest MAPE values were found for WEs within 1 week before delivery (groups one, two and three) (Table 3). From group four to group six, values increased continuously, with significant differences in comparison with group one. For the Hadlock II formula, MAPEs ranged from 6.78 in group 1 to 8.21 in group 4, 18.11 in group 6 and 30.23 in group 8 (p < 0.001). With the Shepard and Merz formulas, the lowest MAPE was found in group three (MAPE: 7.47) and group five (MAPE: 6.91), respectively (Table 3). Overall, the best results were achieved with the Hadlock formulas in group two (MAPE: 6.68 for Hadlock I and 6.62 for Hadlock II) (Table 3).

Table 3 Comparison of median absolute percentage errors (MAPE) and percentages of weight estimations within a 10 % range (≤10 %) between the different time interval groups

Similar results were found regarding the distribution of WEs within prespecified error bounds (Table 3). Again, the best results were achieved with the Hadlock equations in WEs performed 1–3 days before delivery: in group two, 68.76 % (Hadlock I) and 68.73 % (Hadlock II) of fetal WEs fell within the 10 % range of the actual BW (Table 3). For the Hadlock II formula, the percentage of WEs within the 10 % range of the actual BW decreased from 67.20 % in group 1 to 58.84 % in group 4, 19.98 % in group 6 and 2.00 % in group 8 (p < 0.001).

As fetal growth abnormalities might influence the accuracy of sonographic WE with regard to the scan-to-delivery interval, we additionally analyzed the accuracy of the different formulas for the interval groups on a reduced sample of 11,056 measurements, from which all fetuses with a birth weight below the 10th percentile (SGA fetuses) were excluded (data not shown). When comparing the results, slightly lower MPEs were observed in the study group without SGA fetuses for all formulas. However, these differences did not influence the general conclusions of this study.

Discussion

In this study the influence of the time interval between estimation and delivery on the accuracy of sonographic fetal WE was evaluated and compared between eight time interval groups ranging from an estimation on the day of delivery to a maximum of 6 weeks. While the percentage errors for all formulas show a clear tendency towards smaller estimated BW for increasing scan-to-delivery time interval (Fig.), the impact of this finding differed significantly depending on the systematic error of the WE formula: Formulas with a general underestimation yielded the best results in WEs that were performed within 1 week before delivery. Equations with a systematic overestimation of fetal weight showed the highest accuracy in WEs with a scan-to-delivery interval of up to 3 weeks.

Similar results were shown in a study performed by Heer et al. [20]. The authors analyzed 820 singleton pregnancies and evaluated 9 different factors that potentially influence the precision of sonographic weight estimation. Of the 9 evaluated factors, only the interval between sonographic weight estimation and delivery (0–7 vs. 8–14 days) had a statistically significant impact. The MAPE achieved with one of the Hadlock formulas was 8.52 in WEs performed within 1 week before delivery and increased to 10.37 in WEs with an interval of 8 to 14 days. In the present study, WEs performed within 1 week before delivery using the Hadlock II formula showed a better accuracy, with MAPE values ranging between 6.62 and 6.78. Within a scan-to-delivery interval of 8 to 14 days, the MAPE increased to 8.21.

In a systematic review performed by Chauhan et al. 175 studies regarding sonographic WE were included [21]. A total of 33 studies were compared with regard to the scan-to-delivery interval (28 studies with an interval <7 days; 5 studies with an interval >7 days). The pooled accuracy was significantly different in studies where <7 vs. >7 days had elapsed between examination and delivery. The pooled estimates within 10 % of birth weight were 66 % for studies with a scan-to-delivery interval <7 days compared to 47 % with an interval of >7 days (OR 2.17; 1.93, 2.45).

In a study performed by Mongelli et al., 276 fetal WEs within 35 days before delivery were analyzed using one of the Hadlock formulas [19]. The weight estimation was either left unchanged or extrapolated to the time of delivery. MPE was −6.5 without any adjustment for the time interval and 5.9 after using the extrapolation method, showing no systematic underestimation any more when taking the influence of the time interval between estimation and delivery into account. In comparison, in the present study MPEs of WEs performed between 29 and 35 days before delivery with the Hadlock II formula showed a clear underestimation of fetal weight (MPE −23.91), whereas estimations on the day of delivery did not show any remarkable underestimation (MPE −0.41).

No impact of the time interval between investigation and delivery on the accuracy of the WE was reported by Benacerraf et al. [27]. In their study, however, only fetal weight estimations performed within a week before delivery were included. Overall, 74 % of the infants had birth weights within 10 % of the ultrasonographic estimates. In the present study, the percentage of WEs within the 10 % range using the Hadlock II formula was 67.20 % in group 1, 68.73 % in group 2 and 68.56 % in group 3.

MPE values reflect the systematic deviation of a model from the actual BW. In formulas with a systematic underestimation of fetal weight (Hadlock, Warsof), MPEs were closest to zero in WEs that were performed within 3 days before delivery. After that interval, values decreased continuously, indicating a rising underestimation caused by interim fetal growth. The fact that MPEs of WEs on the day of delivery showed a slightly larger underestimation than WEs with a scan-to-delivery interval of 1–3 days could be explained by an increasing descent of the fetal head into the maternal pelvis, leading to inaccurately small measurements of the BPD, OFD and HC. In a study by Peregrine et al., WE with one of the Hadlock formulas performed in 262 women at term prior to induction of labor resulted in even more negative MPE values of −7.6, with random errors of 10.6 [16]. The authors analyzed sonographic WE in comparison with clinical WE by both the doctor and the woman herself and concluded that clinical estimates of BW perform favorably compared with ultrasonographic estimates when performed immediately prior to labor.

In formulas with a systematic overestimation (Merz, Shepard), however, the interim fetal growth progressively compensates for the overestimation. Therefore, the MPEs of these equations were closest to zero in WEs with a scan-to-delivery interval between 8 and 14 days.

The standard deviation of the PEs reflects the random error of WEs. With most of the formulas, the largest values were found in WEs performed on the day of delivery. This could be explained by several problems accompanying WE close to delivery, such as unfavorable fetal positions or a decreasing amount of amniotic fluid.

The MAPE takes systematic and random errors into account. Using the Hadlock and Warsof formulas, the lowest MAPE values were found for WEs within 1 week before delivery (groups one, two and three). After an interval of more than 7 days, however, WEs became increasingly inaccurate. While nearly 70 % of estimates were within 10 % of BW in WEs that were performed within 1 week before delivery using the Hadlock I formula, this rate dropped to less than 56 % after an interval of more than 7 days and decreased to nearly 35 % after more than 2 weeks. In comparison, formulas with a systematic overestimation (Shepard, Merz) showed a significant increase of MAPEs only after an interval of more than 14 days (Shepard) or 21 days (Merz).

In this study we evaluated five commonly used WE formulas. However, these formulas were developed using sonographic WEs with different scan-to-delivery intervals (Warsof et al. and Shepard et al. <3 days, Merz et al. <7 days, Hadlock et al. <8 days). In addition to the factors mentioned above, this might explain the varying effect of the scan-to-delivery interval on the accuracy of the different WE formulas.

In conclusion, the influence of the scan-to-delivery interval on the accuracy of sonographic WE depends on the systematic error of the applied WE formula. While equations with a systematic overestimation of fetal weight showed the highest accuracy in WEs with a scan-to-delivery interval of up to 3 weeks, formulas with an underestimation yielded the best results in WEs that were performed within 1 week before delivery.

For clinical practice, we therefore recommend repeating the sonographic WE when using the Hadlock formulas if the scan-to-delivery interval exceeds 7 days.