Introduction

Although rates of infant mortality in the United States have declined over the last several decades, rates of preterm birth (birth prior to 37 weeks of gestation) and low birth weight (less than 2,500 g) have been slowly increasing over time [1]. Access to and adequate use of quality prenatal care is an important correlate of maternal and child health outcomes, and may therefore help reduce the odds of infant mortality, preterm birth, and low birth weight [2, 3]. Although some of the known risk factors for preterm birth, low birth weight, and infant mortality are not themselves malleable (e.g., history of heart disease), many are (e.g., managing hypertension or diabetes), and might therefore be addressed in prenatal care settings.

Over the past two decades, the utilization of prenatal care delivered in group formats has increased, particularly the CenteringPregnancy (CP) model of group prenatal care [4]. The CP model involves groups of 8–12 women of similar gestational age meeting with licensed health care providers for 10 sessions of approximately 90–120 min each. The CP model contrasts with more common forms of prenatal care that are delivered in an individual format, in which women attend visits with obstetric providers and have little opportunity to interact with other pregnant women. The CP group prenatal care model has been theorized to produce better birth outcomes than traditional individually delivered prenatal care due to increased patient-provider interaction, increased social support, greater perceived empowerment, and increased exposure to useful skills and information about pregnancy, birthing, and childcare.

To date, two randomized controlled trials have compared infant mortality, preterm birth, and low birth weight outcomes for women who received CP versus traditional individually delivered prenatal care [5, 6]. Neither of these trials found differences between CP and traditional care participants in the odds of fetal demise, total gestational age, total birth weight, or the odds of low birth weight. One of those trials [6], which included 322 women at two military settings, found no evidence that the odds of preterm birth were different for women in CP versus individually delivered prenatal care. However, the other randomized trial, a multi-site study of 993 women [5, 7], found that the odds of preterm birth were significantly lower among women in CP than among those in individually delivered prenatal care. Several other non-randomized studies with relatively small sample sizes (less than 400 CP participants) have compared birth outcomes for women in CP versus individually delivered prenatal care, with varying results [8–14]. For example, findings from the largest matched multisite study (N = 458) indicated that CP participants had significantly higher birth weight than participants in traditional prenatal care, but found no significant differences in preterm birth, low birth weight, very low birth weight, or neonatal deaths [10].

Given the limited and inconsistent findings to date [15], the objective of this study was to provide further evidence on this matter by examining the effects of group-delivered (CP) versus traditional individually delivered prenatal care on gestational age, birth weight, and fetal demise outcomes among women receiving prenatal care at five sites in Tennessee. Prior studies have been limited by relatively small sample sizes; this study contributes to the literature with one of the largest samples of CP participants to date, which permits exploration of rare adverse birth outcomes. This study also addresses a major limitation of prior non-randomized studies by using statistical matching procedures to ensure the equivalence of groups of women receiving different formats of prenatal care on a wide range of baseline characteristics.

Methods

Sample

The sample included obstetric patients who received prenatal care at one of five sites that offered both individual and group prenatal care between 2008 and 2011. All five sites received grants from the Tennessee Governor’s Office of Children’s Care Coordination (later from the Tennessee Department of Health) to implement CP, and agreed to participate in an evaluation as part of their grant requirements. Two sites were affiliated with OB/GYN departments in large metropolitan hospitals (A, D); one was a community health center affiliated with a large metropolitan hospital (E); one was an independent faith-based community health center in a metropolitan area (B); and one was an independent rural birthing center (C). Although the sites were diverse in terms of client populations, setting, and specific content, all had a dedicated group space for CP at their facility, and all had at least two providers responsible for delivering CP (primarily CNM, MD, LPN, APN, and doulas) [16]. Four of the sites were approved as Centering sites by the Centering Healthcare Institute, and one was working toward approval. Although enrollment procedures varied across sites, most invited women to participate in CP at their first prenatal appointment unless they were deemed high risk due to prior medical conditions (see [16] for more details).

At each site, staff conducted a retrospective chart review to extract de-identified data from medical records for all women who had received prenatal care and delivered sometime between the month in which CP started at that site (generally mid-2008) and late-2011. De-identified electronic medical record data were extracted at two of the sites. For the other three sites, de-identified data were collected and managed using REDCap, a secure, web-based application designed to support data capture for research hosted at Vanderbilt University through the Vanderbilt Institute for Clinical and Translational Research (grant support 1 UL1 RR024975 from NCRR/NIH) [17]. The retrospective chart review used to collect medical record data on background and childbirth outcomes was part of a contracted evaluation and was not human subjects research. The current study is a secondary analysis of the de-identified dataset that was compiled during that contract evaluation, and was approved by the IRB at Vanderbilt University and conducted in accord with prevailing ethical principles.

Medical record data were extracted for 818 CP and 6,258 traditional prenatal care recipients across the five sites (Site A: n = 5,494; Site B: n = 726; Site C: n = 561; Site D: n = 197; Site E: n = 98). Because the sample was based on retrospective records, we could not randomly assign women to prenatal care conditions. Therefore, we used propensity scores to create statistically matched groups of women who received CP versus traditional prenatal care at each of the five sites [18, 19]. Propensity score matching attempts to reduce the impact of selection bias and confounding on estimated causal treatment effects in non-randomized observational studies [18]. Random assignment permits causal inferences because it ensures that treatment status is independent of baseline characteristics of participants, whether observed or unobserved. Propensity score methods attempt to achieve that result in non-randomized studies by matching or balancing groups on baseline characteristics, with the limitation that they can only do so on observed, i.e., measured, characteristics.

The data available from the medical records varied somewhat across sites, but in all sites the propensity scores were estimated using variables representing demographic characteristics, pregnancy and childbirth history, and risk factors for adverse pregnancy outcomes, e.g., age, race, insurance status, weight at entry, gestational age at entry, gravidity, parity, obesity history, hypertension history, substance use history. The particular variables included in the propensity score estimation models, therefore, also varied across sites, depending on availability in medical charts (see [16] for a complete list of variables used for matching at each site). Candidate variables for inclusion in the propensity scores that were missing data for more than 20 % of the cases at a site were not retained for further analysis; for variables with less than 20 % missing data, we imputed the missing values using an expectation–maximization algorithm [20]. Patients with rare medical conditions who could not be matched with another participant at the same site were excluded (Site A: n = 114; Site B: n = 22; Site C: n = 30; Site D: n = 40; Site E: n = 0). Patients who attended fewer than five prenatal care sessions (CP or traditional) over the course of the pregnancy were also excluded (Site A: n = 0; Site B: n = 254; Site C: n = 24; Site D: n = 5; Site E: n = 9; see Footnote 1).

Propensity scores were estimated separately at each site using a logistic regression model predicting the probability of patients participating in CP (versus traditional) prenatal care, conditional on demographic, medical, pregnancy, and childbirth histories. Patients were excluded if their estimated propensity scores fell outside the common support region where the two groups’ distributions overlapped (Site A: n = 19; Site B: n = 25; Site C: n = 315; Site D: n = 43; Site E: n = 21), which ensured similarity in the propensity score distributions across the two groups (see Footnote 2). We then used a many-to-many matching procedure so that any CP patient who could be statistically matched with a comparable traditional prenatal care patient (and vice versa) was retained in the matched sample [18] (see Footnote 3). The quality of the matching on the individual variables incorporated in the propensity scores was assessed by examining pre- and post-matching means, standardized mean differences, and variance ratios [21, 22] and showed that acceptable covariate balance was achieved at all sites (see [16] for standardized mean differences and variance ratios for all variables by site). The final matched sample included 651 patients who received CP group care and 5,504 patients who received traditional individually delivered prenatal care.
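The balance diagnostics named above can be illustrated with a minimal sketch. It assumes one common definition of the standardized mean difference (difference in group means divided by the square root of the average of the two group variances) and a simple treated-to-control variance ratio; the study's exact formulas are given in [21, 22] and may differ. The function names and the example ages are hypothetical.

```python
import math

def weighted_mean(values, weights):
    # Weighted average of a covariate.
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def weighted_var(values, weights):
    # Weighted variance around the weighted mean (population-style divisor).
    m = weighted_mean(values, weights)
    return sum(w * (v - m) ** 2 for v, w in zip(values, weights)) / sum(weights)

def balance(treated, control, w_treated=None, w_control=None):
    """Return (standardized mean difference, variance ratio) for one covariate.

    Weights default to 1 for every patient (pre-matching diagnostics);
    pass matching or propensity weights for post-matching diagnostics.
    """
    w_treated = w_treated or [1.0] * len(treated)
    w_control = w_control or [1.0] * len(control)
    m_t = weighted_mean(treated, w_treated)
    m_c = weighted_mean(control, w_control)
    v_t = weighted_var(treated, w_treated)
    v_c = weighted_var(control, w_control)
    pooled_sd = math.sqrt((v_t + v_c) / 2)
    return (m_t - m_c) / pooled_sd, v_t / v_c

# Hypothetical maternal ages, for illustration only.
smd, variance_ratio = balance([24, 26, 28, 30], [25, 27, 29, 31])
```

A standardized mean difference near zero and a variance ratio near one would suggest that the two groups are similar on that covariate.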

Measures

Primary outcome data were extracted during the medical chart reviews. Gestational age was measured in weeks, and preterm birth was measured with a binary variable indicating whether gestational age at birth was less than 37 weeks (1 = yes; 0 = no). Birth weight was measured in grams; low birth weight was measured with a binary variable indicating whether birth weight was less than 2,500 g (1 = yes; 0 = no); very low birth weight was measured with a binary variable indicating whether birth weight was less than 1,500 g (1 = yes; 0 = no). Fetal demise data were available at only four sites; this outcome was measured with a binary variable (1 = yes; 0 = no). Missing data were not imputed for any outcome variables; cases without valid delivery data were not included within a given outcome analysis.
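The outcome coding described above can be sketched as follows. The thresholds come from the text; the function and field names are hypothetical, and missing values are passed through as `None` rather than imputed, in line with the handling of missing outcome data.

```python
PRETERM_WEEKS = 37               # gestational age below this is preterm
LOW_BIRTH_WEIGHT_G = 2500        # birth weight below this is low
VERY_LOW_BIRTH_WEIGHT_G = 1500   # birth weight below this is very low

def indicator(value, threshold):
    # 1 = yes, 0 = no; None when the underlying measurement is missing.
    if value is None:
        return None
    return int(value < threshold)

def code_outcomes(gestational_age_weeks, birth_weight_g):
    """Derive the binary birth outcomes from the two continuous measures."""
    return {
        "preterm": indicator(gestational_age_weeks, PRETERM_WEEKS),
        "low_birth_weight": indicator(birth_weight_g, LOW_BIRTH_WEIGHT_G),
        "very_low_birth_weight": indicator(birth_weight_g, VERY_LOW_BIRTH_WEIGHT_G),
    }
```

Under this coding a very low birth weight newborn is also counted as low birth weight, because both thresholds are strict upper bounds on the same measure.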

Data Analysis

Program effects at each site were estimated using weighted ordinary least squares (for continuous outcomes) and weighted logistic regression models (for binary outcomes; see Footnote 4). To increase the total sample size and hence statistical power, we also estimated aggregate models that combined findings across all sites. The combined analyses across sites used multilevel mixed effects linear and logistic regression models that accounted for clustering within sites. The propensity scores were incorporated into the analyses in the form of a weighting function with weights equal to 1/propensity score for CP participants and 1/(1 − propensity score) for traditional prenatal care participants [18, 21, 23, 24]. The purpose of the propensity score matching and weighting procedures was to reduce any bias in the effect estimates associated with differences between the CP and traditional care groups on the covariates included in the propensity score models. To safeguard against any remaining imbalance between groups on key background characteristics, and for face validity purposes, all outcome analyses additionally included age, race, and gravidity as individual covariates (see Footnote 5).
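The weighting function can be made concrete with a short sketch. The weights follow the definition in the text; the weighted mean difference shown here is a simplification that corresponds to a weighted least squares model with only a treatment indicator, not the covariate-adjusted or multilevel models used in the study. Function names and the example records are hypothetical.

```python
def ipw_weight(propensity, in_cp):
    """Weight = 1/p for CP participants, 1/(1 - p) for traditional care."""
    if not 0.0 < propensity < 1.0:
        raise ValueError("propensity score must lie strictly between 0 and 1")
    return 1.0 / propensity if in_cp else 1.0 / (1.0 - propensity)

def weighted_mean_difference(records):
    """records: iterable of (outcome, propensity, in_cp) tuples.

    Returns the weighted mean outcome for CP minus that for traditional care.
    """
    sums = {True: [0.0, 0.0], False: [0.0, 0.0]}  # [sum of w * y, sum of w]
    for outcome, propensity, in_cp in records:
        w = ipw_weight(propensity, in_cp)
        sums[in_cp][0] += w * outcome
        sums[in_cp][1] += w
    cp_mean = sums[True][0] / sums[True][1]
    traditional_mean = sums[False][0] / sums[False][1]
    return cp_mean - traditional_mean

# Hypothetical gestational ages (weeks) and propensity scores.
example = [(40, 0.25, True), (38, 0.5, True), (39, 0.5, False), (37, 0.2, False)]
difference = weighted_mean_difference(example)
```

Patients who were unlikely to receive the care format they actually received get larger weights, which is how the weighting pushes the two groups toward a comparable covariate distribution.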

Results

Table 1 presents descriptive statistics on key background characteristics for the original unmatched sample of 818 CP patients and 6,258 traditional prenatal care patients, split by site. The propensity score matching procedure helped minimize the differences in background characteristics between the two groups, as shown in Table 2 for the final matched sample of 651 CP patients and 5,504 traditional patients. Table 3 shows the differences in the continuously measured birth outcomes (gestational age, birth weight) for women enrolled in CP versus traditional prenatal care at each site separately and for all five sites combined. Results indicated that women in CP prenatal care had significantly higher gestational ages than women in traditional prenatal care at Site A (b = .35, 95 % CI [.11, .59]), Site B (b = .65, 95 % CI [.12, 1.17]), and Site E (b = .77, 95 % CI [.08, 1.46]). At Site D women in CP had gestational ages almost one week shorter than women in traditional prenatal care, but this difference was not statistically significant (b = −.78, 95 % CI [−1.58, .03]). In the combined results across all five sites, women in CP had significantly longer gestational ages than women in traditional care, by approximately one-third of a week (b = .35, 95 % CI [.29, .41]). Although favoring CP, this statistically significant overall effect has relatively little clinical significance.

Table 1 Demographic profiles of unmatched prenatal care participants, by site and prenatal care format
Table 2 Demographic profiles of statistically matched prenatal care participants, by site and prenatal care format
Table 3 Unstandardized regression coefficients and confidence intervals indexing differences in gestational age and birth weight for CP and traditional prenatal care participants

To further explore the potential clinical relevance of this effect, we conducted post hoc analyses examining the effect of CP prenatal care on gestational age only for those women who experienced adverse birth outcomes during the current pregnancy, i.e., preterm birth or low birth weight. Results indicated a statistically and clinically significant effect of CP on gestational age for those participants with preterm births (b = 2.56, 95 % CI [2.44, 2.68]), and with low birth weights (b = 2.24, 95 % CI [1.90, 2.59]; see Footnote 6). These effects were equivalent to gestational ages more than two weeks longer for CP participants with preterm birth or low birth weight.

Analysis of total birth weight indicated no significant differences between CP and traditional care patients at any individual site (see Footnote 7). However, combined results across all sites indicated significantly higher birth weights for women in CP, almost 30 g average difference between the groups (b = 28.6, 95 % CI [4.8, 52.3]). Given the relatively small clinical magnitude of this effect, we again conducted exploratory post hoc analyses focusing on women with preterm births or low birth weight newborns. Results indicated a statistically and clinically significant effect of CP on birth weight for women with preterm births (b = 368.1, 95 % CI [278, 458.3]) or low birth weights (b = 339.5, 95 % CI [284, 395]), equivalent to over 300 additional grams of birth weight for CP participants who experienced these adverse birth outcomes (see Table 3 and Footnote 8).

Despite the observed differences in total gestational age, Table 4 indicates minimal differences between CP and traditional care patients in the odds of preterm birth. Preterm birth rates were higher among CP patients at some sites and higher among traditional care patients at other sites—but most of those differences were not statistically significant. The odds of preterm birth were significantly lower for CP participants at Site E, however, where all 6 preterm births were among women enrolled in traditional prenatal care (Fisher’s exact test p = .02). The combined results across all five sites nonetheless indicated no significant difference in the odds of preterm birth for women in CP versus traditional prenatal care.
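For readers unfamiliar with Fisher's exact test, the following is a minimal standard-library sketch of the two-sided version for a 2 × 2 table, using the usual rule of summing the probabilities of all tables that are no more likely than the observed one. It is not the software used in the study, and the counts in the usage line are made up purely for illustration; the Site E cell counts are not reported in this section.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    Rows are the two care formats; columns are outcome present / absent.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x events in row 1, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )

# Made-up counts: 0 of 20 events in one group versus 6 of 30 in the other.
p_value = fisher_exact_two_sided(0, 20, 6, 24)
```

The exact test is a natural choice when one cell is zero, as at Site E, because the odds ratio from a logistic model is not estimable in that situation.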

Table 4 Odds ratios and confidence intervals indexing differences in birth outcomes for CP and traditional prenatal care participants

Results indicated no significant differences between CP and traditional care patients in the odds of low birth weight at the individual sites (see Table 4). However, combined results across sites indicated significantly lower odds of very low birth weight babies for CP patients than traditional care patients (OR = .21, 95 % CI [.06, .70]). Holding age, race/ethnicity, and gravidity constant, this effect is equivalent to a prevalence of .08 % very low birth weight babies among CP participants versus .30 % among traditional care participants. This finding was largely driven by the results at Site A where the odds of a woman in CP having a very low birth weight baby were notably lower than the odds for a woman in traditional prenatal care (OR = .14, 95 % CI [.04, .43]).

As shown in the last section of Table 4, fetal demise was rare and only occurred at the two largest sites (A & B). At Site A, all 68 instances of fetal demise occurred for women in traditional prenatal care (Fisher’s exact test p = .02). The incidence of fetal demise was lower for CP participants at Site B, but the difference was not statistically significant. The combined analysis showed that overall CP participants had significantly lower odds of fetal demise than their matched counterparts in traditional prenatal care (OR = .12, 95 % CI [.02, .92]). Controlling for the covariates, this effect is equivalent to a .17 % prevalence of fetal demise among CP participants versus 1.32 % prevalence among traditional prenatal care participants.
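The link between an odds ratio and the two prevalences can be checked with a short sketch. This is a crude, unadjusted conversion; the prevalences reported above are model-based and hold the covariates constant, so the crude figures should only approximately match the reported odds ratio. The function names are hypothetical.

```python
def odds(prevalence):
    # Convert a probability (0 < p < 1) to odds.
    return prevalence / (1.0 - prevalence)

def odds_ratio(prevalence_cp, prevalence_traditional):
    # Crude odds ratio implied by two prevalences.
    return odds(prevalence_cp) / odds(prevalence_traditional)

def prevalence_from_or(or_value, prevalence_traditional):
    # Prevalence in the CP group implied by an odds ratio and a baseline.
    cp_odds = or_value * odds(prevalence_traditional)
    return cp_odds / (1.0 + cp_odds)

# Fetal demise: reported prevalences of .17 % (CP) and 1.32 % (traditional).
implied_or = odds_ratio(0.0017, 0.0132)              # about 0.127
implied_prevalence = prevalence_from_or(0.12, 0.0132)  # about 0.0016, i.e. .16 %
```

Both figures sit close to the reported OR of .12 and prevalence of .17 %, with the small gaps plausibly reflecting rounding and covariate adjustment. Because the outcome is rare, the odds ratio is also close to the simple ratio of the two prevalences.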

Discussion

This study compared birth outcomes for women who received two different forms of prenatal care at five sites in Tennessee. Results indicated that women in CenteringPregnancy (CP) group prenatal care, compared to women in traditional individually delivered prenatal care, had significantly longer gestational ages and higher overall birth weights. The significant effects of CP on gestational age and birth weight were relatively small in clinical terms, equivalent to an additional one-third of a week in gestation and 29 g in birth weight. However, post hoc exploratory analyses revealed that the significant beneficial effect of CP on gestational age was stronger for participants who delivered preterm, such that CP was associated with gestational ages 2.56 weeks longer, a substantial and clinically significant effect. Similarly, despite the relatively small beneficial effect of CP on total birth weight, this effect was more pronounced among participants with preterm births or low birth weight infants, with CP being associated with birth weights 368 g and 340 g higher, respectively, than traditional care. Results also indicated that CP was associated with significantly and substantially lower odds of very low birth weight and fetal demise, but findings provided no evidence of differences between the CP and traditional prenatal care participants in the odds of preterm birth or low birth weight. Further, at none of the sites did CP participants have consistently worse birth outcomes than their matched counterparts in traditional prenatal care. Thus, results indicated largely beneficial effects of CP prenatal care on women’s birth outcomes at these five Tennessee sites.

This study’s finding of a non-significant effect of CP on low birth weight was consistent with previous research [5, 6, 10, 11, 14]. However, contrary to prior randomized controlled trials [5, 6], we found significant differences in total gestational age and birth weight; and, unlike three previous studies, we did not find a beneficial effect of CP on the odds of preterm birth [5, 13, 14]. The discrepant findings from the current study could be due to inadequate matching procedures, but could also be associated with other factors such as higher statistical power (i.e., because of the large sample size), variation in participant populations or implementation procedures, or simply sampling error.

When interpreting these findings, we must acknowledge the study’s strengths and weaknesses. The primary strengths of the study were the large aggregate sample size across sites and the use of rigorous statistical matching procedures to create groups of women enrolled in CP or traditional prenatal care that were equivalent on a wide range of relevant baseline characteristics. The large sample size permitted examination of very low birth weight and fetal demise—outcomes often unobserved in smaller studies [5, 6]. The primary weakness of the study was the lack of random assignment. Although we used propensity scores to balance participants on many relevant baseline characteristics, we have no assurance that all potentially biasing variables have been included. Reliance on retrospective chart reviews inherently limited the availability of variables for use in the propensity score estimation, meaning that some variables that may have improved the balancing were omitted (e.g., transportation, work schedules, history of periodontal disease).

Furthermore, enrollment procedures used at most sites dictated that women were only able to enroll in CP if they were not deemed at “high risk” for adverse pregnancy outcomes. Although different sites used different definitions of high risk that would disallow CP participation, common exclusions were histories of preterm birth, low birth weight, cesarean births, diabetes, lupus, heart disease, or other prior medical conditions. Because the propensity score methods necessitated the exclusion of participants that could not be matched to equivalent participants receiving a different format of prenatal care, prenatal care participants with high medical risk were not well represented in this study and results should not be generalized to such populations. We did, however, conduct exploratory sensitivity analyses (not shown here) of the effects of CP on birth outcomes for women with different levels of medical risk. The results indicated that at the largest site (A), the beneficial effects of CP on birth outcomes were slightly larger for women with more medical risk factors [16]. These findings are, at best, only suggestive, particularly because they were not observed at any of the other sites. They do, however, indicate that the relatively favorable effects found for CP are unlikely to have resulted from inadvertent inclusion of more high-risk cases in the traditional care sample through inadequate matching.

Findings from the current study therefore add to the accumulating evidence of beneficial effects of CP group prenatal care, even among participants with preterm or low-birth weight outcomes. Despite these promising findings, and as noted in a recent systematic review [25], limited research exists on the underlying mechanisms behind these effects. Because we could not collect process data on how prenatal care was delivered at each site, and due to limited antepartum data related to maternal behaviors, we were unable to examine the mechanisms by which CP yields beneficial effects on birth outcomes. The logic model of CP prenatal care suggests that the personalized care, participatory feedback, and social support involved in CP may yield beneficial maternal and infant effects, but how those mechanisms actually operate is unclear. For instance, personalized care may improve adherence to care and promote healthy group norms that lead to maternal lifestyle changes (e.g., improved diet, exercise, smoking cessation, abstinence from alcohol/drugs, healthy gestational weight gain), which may in turn lead to longer gestational ages and higher birth weight. Increased social support may also improve mental and physical health, particularly among women under stress [4, 26, 27]. Although we are unaware of any research that has explicitly examined the causal pathways by which CP may lead to improved birth outcomes, prior studies have indicated that women in CP have higher levels of satisfaction with prenatal care, better knowledge about pregnancy and infant care, and receive more adequate prenatal care than women in traditional individually delivered care, all factors that might contribute to the beneficial effects observed in the current study [5, 8, 28, 29]. More research is therefore needed to examine causal pathways to identify the mechanisms by which CP may affect infant and maternal health outcomes; such knowledge could be used to identify key kernels or components of care that might be adopted in other prenatal care settings.

Health policy reforms aimed at reducing adverse birth outcomes may consider group prenatal care a promising alternative format for delivering prenatal care. Widespread adoption in clinical settings may be premature, however, until more research has examined why group prenatal care may have beneficial effects, for what client populations those effects are strongest/weakest, and the cost and cost-effectiveness of different formats of prenatal care. Whereas some studies have found that delivering prenatal care in group versus individual formats is cost neutral [5], others have suggested that group care may offer financial advantages in larger facilities with adequate patient volumes, but may not be cost effective and may adversely affect productivity in small, rural facilities [30]. Thus, more research is needed to examine the costs associated with different prenatal care delivery models across diverse healthcare settings, and subsequent implications for proposed health policy reforms. Given the accumulating evidence of beneficial effects of CP group prenatal care, understanding the mechanisms behind such effects and the cost implications of widespread implementation will be critical for informing state and local health policies aimed at improving maternal and perinatal health outcomes.