Introduction

Induction of labor (IOL) is a relatively common procedure in obstetrics, with rising rates throughout the world [1]. In the United Kingdom it has been reported that 19.8 % of all deliveries in 2004–2005 were induced, with fewer than two in three women giving birth without further intervention, 15 % having instrumental births and 22 % having emergency cesarean deliveries [2, 3]. In the year 2011–2012 the IOL rate had risen to 22.1 %, with a mean emergency cesarean section (CS) rate of 30.2 % for primiparous and 13.2 % for multiparous women whose labors were induced [4, 5]. National clinical guidelines in the UK recommend that induction of labor should be offered when it is likely that a better outcome will result if labor is initiated than if the pregnancy continues [2]. The reason for this recommendation is that induced labor may be less efficient than spontaneous labor and an assisted delivery is more likely to be required [2, 6]. It has been reported that women who have been induced are twice as likely to be delivered by cesarean section as women with spontaneous onset of labor [7]. Also, if an emergency CS is required this may lead to increased maternal morbidity and mortality, with consequences in future pregnancies such as risk of uterine rupture and placenta previa [8–10]. Moreover, the assisted delivery associated with induced labor has been shown to increase costs for the health system, with an uncomplicated CS delivery estimated to cost £704 more than a vaginal birth [11, 12].

There have been many attempts in the literature to identify those women most likely to have a successful vaginal delivery after being induced for various indications. Some of these reports have focused on risk factors determining the success of the IOL process, such as parity, cervical score and method of induction [1]. Other studies have highlighted the risk factors that are related to failure to progress in active labor, such as malposition and a high body mass index (BMI) [1, 13]. In our population-based study the primary aim was to identify the predictors that increased the risk of cesarean section delivery in women being induced for all medical reasons. The secondary aim was to combine these predictors into a risk-assessment calculator that will assist the clinician in everyday obstetric practice.

Materials and methods

This was an observational cohort study of women induced for various medical reasons at the Maternity Unit of the Shrewsbury and Telford Hospital (SaTH) National Health Service (NHS) Trust, between January 2007 and December 2013. Women with singleton cephalic presentation deliveries induced for all medical indications were considered eligible for the study. Women induced for stillbirths or fetal congenital abnormalities, and women with multiple pregnancies, were excluded. Data were collected from the Medway® obstetric database, and the IOL indication, maternal data, labor/delivery data and neonatal data were recorded.

The indications for IOL were categorised into two groups: post-date pregnancy (gestational age >41 weeks) and non post-date pregnancy (gestational age <41 weeks). In the non post-date pregnancy group we included the following indications: reduced fetal movements, fetal growth restriction, pregnancy-induced hypertension, preeclampsia/eclampsia, gestational diabetes, term (>37 weeks) prelabor rupture of membranes for more than 24 h, preterm (<37 weeks) prelabor rupture of membranes, obstetric cholestasis, and any other medical condition.

Maternal demographic and clinical data included age, parity, gravidity, BMI at booking (normal 18.5–24.9 kg/m2, overweight 25.0–29.9 kg/m2, obese BMI ≥30.0 kg/m2), ethnicity (white European, Black, Asian, other) and smoking status. Labor and delivery data included gestational age at delivery, route of birth (normal vaginal delivery, instrumental vaginal delivery, cesarean section delivery), indications for cesarean section delivery, epidural use and liquor appearance (normal, meconium stained). Neonatal data involved fetal gender (male, female), birth weight, head circumference, Apgar scores (at 1 and 5 min), cord gases taken at delivery (arterial, venous) and admission to the neonatal unit.

Quantitative variables are expressed as mean values (SD) or as median values (interquartile range). Qualitative variables are expressed as absolute and relative frequencies. Chi-square tests were used for the comparison of proportions. Student’s t tests were used for the comparison of mean values when the distribution was normal, and the Mann–Whitney test for the comparison of median values when the distribution was not normal.

Univariate logistic regression analysis was used to explore the association of study variables with the risk of cesarean section. To find factors independently associated with the risk of cesarean section, a multiple logistic regression analysis was performed using a stepwise method. Odds ratios with 95 % confidence intervals were computed from the results of the logistic regression analyses. Scores were then assigned to each variable according to its β value in the final logistic regression model to establish the cesarean section risk scoring system [14, 15].
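The score-assignment step can be illustrated with a minimal sketch. It assumes one common convention, dividing each β coefficient by the smallest β and rounding to the nearest integer; the exact rule used in the study is not stated here, and the coefficients below are hypothetical placeholders rather than the values in Table 4. Python is used purely for illustration (the study analyses were run in STATA).

```python
def assign_points(betas):
    """Convert logistic-regression coefficients to integer points.

    Assumed rule (not confirmed by the source): divide each beta by the
    smallest beta and round to the nearest integer. All betas are assumed
    to be positive, i.e. every variable is coded as a risk factor.
    """
    base = min(betas.values())
    return {name: round(beta / base) for name, beta in betas.items()}


def total_score(points, present):
    """Sum the points of the risk factors that are present for one woman."""
    return sum(points[name] for name in present)


# Hypothetical coefficients, chosen only to illustrate the arithmetic.
example_betas = {"primiparity": 1.60, "meconium": 0.42, "male_fetus": 0.20}
example_points = assign_points(example_betas)
example_total = total_score(example_points, ["primiparity", "male_fetus"])
```

With these placeholder coefficients the largest β receives the most points, and a woman's total score is the sum of the points for the factors that apply to her.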

Model diagnostics for the risk factors model were evaluated using the Hosmer and Lemeshow statistic [14]. The prognostic ability of the risk prediction model and its scoring system was evaluated with the receiver operating characteristic (ROC) curve [14, 15]. ROC analysis was also used to find the optimal cut-off of the risk scoring system for the prediction of cesarean section. The overall performance of the ROC analysis was quantified by computing the area under the curve (AUC). Accuracy was evaluated with the sensitivity, specificity, and positive and negative predictive values [14].
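The accuracy measures and the cut-off search can be sketched as follows. This is an illustration under stated assumptions: the criterion used to pick the optimal cut-off is not specified in the text, so Youden's J (sensitivity + specificity − 1) is assumed here, and the scores and outcomes are toy values, not study data.

```python
def _ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def metrics_at_cutoff(scores, outcomes, cutoff):
    """Treat score >= cutoff as test-positive; outcome 1 = CS, 0 = vaginal birth."""
    pairs = list(zip(scores, outcomes))
    tp = sum(1 for s, y in pairs if s >= cutoff and y == 1)
    fp = sum(1 for s, y in pairs if s >= cutoff and y == 0)
    fn = sum(1 for s, y in pairs if s < cutoff and y == 1)
    tn = sum(1 for s, y in pairs if s < cutoff and y == 0)
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }


def best_cutoff(scores, outcomes):
    """Assumed criterion: maximise Youden's J over the observed score values.

    Requires at least one case and one non-case; ties go to the lowest cut-off.
    """
    def youden(cutoff):
        m = metrics_at_cutoff(scores, outcomes, cutoff)
        return m["sensitivity"] + m["specificity"] - 1

    return max(sorted(set(scores)), key=youden)


# Toy data for illustration only.
toy_scores = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
toy_outcomes = [0, 0, 0, 0, 1, 0, 1, 0, 1, 1]
toy_cutoff = best_cutoff(toy_scores, toy_outcomes)
toy_metrics = metrics_at_cutoff(toy_scores, toy_outcomes, toy_cutoff)
```

Sensitivity and specificity are properties of the cut-off alone, whereas the predictive values also depend on how common cesarean section is in the cohort.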

For the internal validation of the cesarean section risk scoring system the non-parametric bootstrap method was used [14, 16]. The bootstrapping procedure involved sampling with replacement from the original data, with the same sample size, to generate bootstrap samples; ROC analysis of the scoring system was then performed on each sample, and the procedure was repeated for 1000 iterations. The AUC of the ROC for the model in the bootstrap samples was estimated along with its 95 % confidence interval. All p values reported were two-tailed. Statistical significance was set at 0.05 and analyses were conducted using STATA statistical software (version 11.0).
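The bootstrap procedure described above can be sketched in a few lines. This is a minimal illustration, not the STATA routine used in the study: the percentile method for the confidence interval is an assumption, the function names are hypothetical, and the data are toy values.

```python
import random


def auc(scores, outcomes):
    """AUC as the probability that a random case outscores a random non-case.

    Ties count as 0.5. The pairwise loop is written for clarity; a rank-based
    computation would be preferable for a cohort of several thousand women.
    """
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(scores, outcomes, iterations=1000, alpha=0.05, seed=0):
    """Percentile confidence interval for the AUC (assumed method)."""
    rng = random.Random(seed)
    n = len(scores)
    estimates = []
    while len(estimates) < iterations:
        # Resample women with replacement, keeping the original sample size.
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [outcomes[i] for i in idx]
        if 0 < sum(ys) < n:  # both outcomes must be present to compute an AUC
            estimates.append(auc([scores[i] for i in idx], ys))
    estimates.sort()
    lower = estimates[int((alpha / 2) * iterations)]
    upper = estimates[int((1 - alpha / 2) * iterations) - 1]
    return lower, upper


# Toy data for illustration only.
toy_scores = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
toy_outcomes = [0, 0, 0, 0, 1, 0, 1, 0, 1, 1]
toy_auc = auc(toy_scores, toy_outcomes)
toy_ci = bootstrap_auc_ci(toy_scores, toy_outcomes, iterations=200)
```

Resamples that happen to contain only one outcome class are redrawn, since the AUC is undefined for them; with a cohort of the size studied here this would rarely occur.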

As data were collected retrospectively from an electronic database, it was not possible to go back and collect any data that were missing for the patients. We therefore included only women with complete data for all study parameters in the construction of the risk-assessment model and the final calculations.

Ethical approval for collection and management of data in our study was obtained from the Research and Development Department of the Shrewsbury and Telford Hospital NHS Trust and from the National Research Ethics Service (NRES) committee of West Midlands-Coventry & Warwickshire.

Results

From January 2007 to December 2013, 7390 eligible women fulfilling the inclusion criteria of our study were identified. Full data on all study parameters were available for 6169 women, who constituted the final group included in our model and used for the final calculations.

The sample consisted of 6169 women with a mean age of 28.9 years (SD 6.2 years). The sample characteristics are shown in Tables 1 and 2. In our cohort, 54.2 % of women were under 30 years of age and 35.5 % were primigravida. Women of white European ethnicity made up 90.8 % of the cohort, 83.6 % were non-smokers and 41.7 % were of normal BMI. Post-dates pregnancy was the reason for IOL in 32.4 % of women and the cesarean section rate was 13.3 %. The indication for the CS delivery was recorded on the electronic database in 61 % of the women in our sample, with failed IOL, failure to progress in labor and other indications accounting for 16.3, 41.6 and 42.1 %, respectively. A total of 3269 male and 2900 female infants were delivered, with 4.9 % requiring admission to the neonatal unit.

Table 1 Maternal demographics and labor characteristics (n = 6169)
Table 2 Neonatal characteristics in the sample (n = 6169)

Univariate logistic regression analysis for the risk of cesarean section (Table 3) showed that increased maternal age, being overweight or obese, meconium-stained liquor, epidural use, primiparity and non post-date pregnancy were associated with a greater likelihood of cesarean section. Additionally, women of black ethnicity had greater odds of having a cesarean section. The odds of cesarean section were also greater for male fetuses in univariate analysis.

Table 3 Odds ratios (OR) and 95 % confidence intervals (95 % CI) derived from univariate logistic regression analyses for cesarean section delivery

When multiple logistic regression analysis was conducted using a stepwise method, it was found that maternal age, being overweight or obese, black ethnicity, meconium-stained liquor, epidural use, primiparity, non post-date pregnancy and male fetal sex were the independent risk factors for cesarean section (Table 4). The Hosmer–Lemeshow goodness-of-fit test showed that the risk assessment model had good prediction consistency (p = 0.298). The AUC of the ROC curve of the model was 0.76 (95 % CI 0.74–0.78, p < 0.001), indicating that the risk factors’ regression model had a good predictive ability.

Table 4 Odds ratios (OR) and 95 % confidence intervals (95 % CI) for the risk model derived from stepwise multiple logistic regression analysis for cesarean section delivery

The risk scoring system for cesarean section based on the final regression model was composed of the eight aforementioned variables, and the score assignment for each risk factor is shown in Table 4. The final score could range from 0 to 32 points. ROC curve analysis showed that the optimal cut-off of the risk score for the prediction of cesarean section was 11, with sensitivity equal to 75.8 %, specificity equal to 65.1 %, negative predictive value equal to 93.8 % and positive predictive value equal to 25.0 %. The area under the curve (AUC) was 0.76 (95 % CI 0.74–0.78), which differs significantly from 0.5 (p < 0.001). The validation results of the bootstrap method were good and showed that the AUC of the ROC curve of the scoring system was 0.76 (95 % CI 0.74–0.78, p < 0.001). Figure 1 shows the observed versus predicted probabilities of cesarean section delivery, demonstrating a well-calibrated model.

Fig. 1 Calibration plot of the model applied to 6169 women (goodness of fit p = 0.298). The straight line represents the predicted CS rate and each circle represents the observed CS rate; larger circles indicate points based on more data

Table 5 presents the proportion of cases with cesarean section according to different risk score ranges. The proportion of cases with cesarean section tended to increase with increasing scores. The percentage of cases with cesarean section was 5.4 % for a risk score less than 11, whereas it increased to 25.0 % for a risk score equal to 11 or more. Of the women with a risk score ranging from 18 to 24, 41 % had a cesarean section, and the proportion further increased to 62.5 % in women with a risk score from 25 to 32. Among the whole population and the low-risk and high-risk subgroups, the number of women with induced labor having a CS delivery was one in eight, one in twenty-one and one in four, respectively. Of the women having a CS delivery, 75.8 % were in the high-risk population.

Table 5 Proportion of cases with cesarean section according to different risk score ranges
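The two risk groups described above follow directly from the reported cut-off, and the rule can be written as a small helper. This is a sketch only: the function and constant names are hypothetical, while the cut-off (11), the score range (0 to 32) and the observed cesarean section proportions are those reported in the text.

```python
CUTOFF = 11      # optimal cut-off reported from the ROC analysis
MAX_SCORE = 32   # upper end of the reported score range

# Observed CS proportions reported in the text for each group.
OBSERVED_CS_RATE = {"low-risk": 0.054, "high-risk": 0.250}


def risk_group(score):
    """Classify a total risk score as low-risk (< 11) or high-risk (>= 11)."""
    if not 0 <= score <= MAX_SCORE:
        raise ValueError(f"score must be between 0 and {MAX_SCORE}, got {score}")
    return "high-risk" if score >= CUTOFF else "low-risk"


example_group = risk_group(10)
example_rate = OBSERVED_CS_RATE[example_group]
```

A score of exactly 11 falls in the high-risk group, matching the "equal to 11 or more" wording used for the 25.0 % figure.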

Discussion

In our study primiparity was the strongest risk factor in the risk assessment model for the prediction of CS delivery in women being induced with the highest score of eight being assigned to it. Parity is a well recognised risk factor in the literature for predicting the success of an induction of labor [17]. It is suggested that the odds for cesarean section delivery for failed IOL are 75 % higher in nulliparous women compared to multiparous women with the same cervical length [18]. Other studies have also shown that primiparity is a risk factor for failed IOL [19, 20]. A recent study from the United States in 2013 reported that in a total of 228,562 deliveries primiparity was a significant risk factor for failure to progress in labor leading to an emergency CS delivery [21].

The second strongest risk factor for CS delivery in our cohort was black ethnicity. This finding is consistent with other studies that have shown that, after adjustment for confounding factors such as parity, BMI and age, there is a strong association between black ethnicity and CS outcome in women with spontaneous and induced labors [22, 23].

In our study, women over 30 years of age had an increased risk of CS delivery when offered induction of labor. A large UK-based study in 2008 involving 583,843 nulliparous women reported that the CS delivery rate among nulliparous women aged 30–34 years increased threefold whereas over 35 years it increased sevenfold, with similar associations observed in multiparous women [24]. The study population involved both spontaneous and induced labors, and the results suggested a biological effect of advanced maternal age on labor performance rather than an effect of obstetrical intervention or maternal preference. Moreover, the authors of this report performed in vitro studies of uterine smooth muscle and concluded that increasing maternal age was associated with reduced spontaneous activity and impaired myometrial contractility.

Another risk factor we identified was BMI: increasing BMI in our cohort was associated with an increased risk of CS delivery. There are many reports on the effect of increased BMI on both the failure to induce labor and the failure to progress while in active labor. Recent studies suggest that, because adipose tissue is hormonally active, increased BMI predisposes to a reduced response to induction of labor and may also inhibit the spontaneous initiation of labor because of the altered metabolic status of the overweight or obese woman [25, 26]. A large population-based study including 287,213 pregnancies in London, published in 2001, showed that increasing maternal BMI was associated with multiple antenatal complications, adverse neonatal outcome and increased emergency CS delivery [13]. More recent studies have also confirmed that increased BMI is a risk factor for assisted delivery in spontaneous or induced labor [27].

In our study, induction for post-dates pregnancy (>41 weeks of gestation) was associated with lower CS rates than induction earlier in pregnancy for any medical reason. This is in contrast with the literature, where a systematic review in 2009 reported that increasing gestational age resulted in increased CS rates [17]. Other recent studies suggest that inducing women at term does not increase the CS rates when compared to expectant management or induction of labor at 41 weeks for prolonged pregnancy [28, 29]. This discrepancy among the various studies most likely reflects the different study population characteristics and the different study designs that have been applied to measure the increase or decrease in the CS rates in induced labor.

Meconium-stained liquor was also associated with CS delivery in our cohort. A recent study has reported that the presence of meconium in liquor is a function of the duration of labor, rising from 2.8 % in women prior to the onset of labor in elective CS to 23.1 % of women in active labor [30]. Meconium-stained liquor is thought to be a sign of fetal hypoxia and has been associated with lower Apgar scores and higher rates of assisted delivery [30, 31].

The use of epidural analgesia was a significant predictor for CS outcome in our study. On review of the literature however there is inconclusive evidence on the effect of epidural analgesia in labor [32]. Moreover, the studies reporting on the effect of epidural analgesia on the mode of delivery have included women with spontaneous onset of labor and not induced labor [32].

Finally, male fetal gender was a risk factor for an increased CS rate in our cohort of women. A few reports so far suggest that there are different steroidal pathways regarding the onset of labor between female and male fetuses [25]. It is suggested that the fetoplacental response to the induction of labor process differs between the fetal genders, which could explain the higher IOL failure rates in male fetuses [33]. These differences are reported to account for the longer interval between induction and delivery and the higher incidence of augmentation of labor with oxytocin when the fetus is male [33]. Also, there is strong evidence that male and female fetuses differ in growth and development in utero due to sex-biased gene expression leading to divergent growth patterns [34]. It has been reported that male fetuses grow faster in utero and have a greater birth weight than female fetuses with the same placental size [35]. As a consequence of their higher tissue mass, however, male fetuses may have increased oxygen consumption needs. Moreover, when males are subjected to the stress of labor they may have less placental reserve to draw on if sub-optimal conditions appear [34]. This may explain why male fetuses have a higher incidence of fetal distress, as reflected in being more likely to have fetal blood sampling, cardiotocographic abnormalities, lower Apgar scores, and a higher incidence of operative delivery [36, 37].

There are certain limitations to be considered about our study. First, data were retrospectively collected from an electronic database for the study period 2007–2013. The accuracy of the data depends on the practitioner recording the information on the database each time. Because of the large number of eligible patients (n = 7390) it was not possible to go back and collect any missing data from the patients’ hospital notes. Nevertheless, the cohort was still large enough to generate clinically meaningful results and to construct the risk assessment model even after including only women with complete data for all study parameters (n = 6169).

Second, our electronic database did not contain mandatory fields for recording the cervical status. It is well established in the literature that cervical scoring in terms of the Bishop score is a key factor in the success of IOL [1, 19, 38]. Therefore, we could not incorporate this variable in our risk-prediction model and adjust for any confounding effect.

Third, our primary outcome was the overall CS delivery rate in the cohort of women being induced for all medical reasons. This means that the CS group involved women that had a failed IOL meaning that they were induced but never went into active labor, and also women who went into active labor but failed to progress and give birth. We chose this combined outcome so as to generate a larger number of subjects to be incorporated into the mathematical models. A sub-group analysis of women with failure to progress in labor and of women with failed IOL with the simultaneous use of cervical Bishop scoring may have led to different results.

Fourth, the indications for the IOL were grouped into the categories of post-date (>41 weeks) and non post-date pregnancy indications (<41 weeks). This categorisation is indicative more of the gestational age at induction than the actual indication for IOL itself. Due to the several indications for IOL it was not possible to break down the results per specific indication as this would have led to multiple risk factors of CS delivery per indication and to less meaningful conclusions.

Fifth, our maternity unit at the Shrewsbury and Telford Hospital has a relatively low CS rate of 13.3 % in women offered IOL, in comparison to the national average of 22 % even in 2004–2005. Our unit should be considered an outlier with a strong culture of vaginal delivery and minimal obstetric intervention. In our unit women are allowed sufficient time for labor to progress in order to optimise their chances of vaginal delivery. This means that women in other units might have had an emergency CS much earlier for failed IOL or failure to progress in labor, whereas in SaTH there is a higher threshold before declaring a failure in the induction process or in labor progress. If our study were replicated in other units with higher CS rates in women offered IOL, the risk factors and the risk-prediction model might be different.

The main strength of our study is that it has included a large number of patients (n = 6169) with complete data for all study variables. This has led to a well calibrated risk-assessment model with good distinguishing abilities and accuracy. Moreover, six out of the eight risk factors of the model are antenatal factors (age, BMI, parity, ethnicity, post-dates/non post-date pregnancy, fetal gender) whereas the remaining two factors are intrapartum (epidural analgesia, meconium-stained liquor), thus making estimation of the total score in the risk-assessment model easy and simple. Finally, if a woman being induced is considered low-risk for a CS delivery, due to the high negative predictive value (93.8 %) of the model this allows for a low misdiagnosis rate and provides reassurance for the high likelihood of a vaginal delivery.

Conclusion

In a large population-based cohort study of women induced for all medical reasons, we have identified six antenatal risk factors for CS delivery (age >30 years, BMI >25 kg/m2, primiparity, black ethnicity, non post-date pregnancy, male fetal gender) and two intrapartum risk factors (meconium-stained liquor, epidural use). These risk factors combined in a scoring system have led to a risk assessment model with good prediction consistency and distinguishing ability. Due to the high negative predictive value of the model (93.8 %), a woman classified as low-risk for CS delivery has a low likelihood of being misclassified. The practical implication of this study is a simple and easy-to-use prediction tool for the practitioner in everyday clinical practice. Its goal is not to increase the cesarean section rates by classifying women as high-risk from the very beginning of the induction process but to provide reassurance to those women who are low-risk for CS delivery. Further studies with even larger cohorts are needed so as to include risk assessment per indication of IOL and also other study variables such as the cervical Bishop score.