Introduction

Substantial evidence exists that breastfeeding imparts numerous health benefits to both the child and mother [1, 2]. Based on this evidence, the American Academy of Pediatrics and the American Dietetic Association recommend exclusive breastfeeding of almost all infants until 6 months of age and complementary breastfeeding through the balance of the infant’s first year [3, 4], while the World Health Organization recommends breastfeeding until at least 2 years of age [5]. Despite these clear recommendations, breastfeeding initiation and duration among women in the United States are low, with 74% initiating breastfeeding and only 42% continuing until their infants are 6 months of age. These rates are even lower for low income women; for example, only 67% of women enrolled in the Special Supplemental Nutrition Program for Women, Infants, and Children (WIC) initiate breastfeeding and only 33% continue to 6 months [6]. In Michigan, the breastfeeding rates for WIC mothers are lower still, with 49% initiating and 15% continuing until 6 months [7]. Increasing breastfeeding, particularly among the low income population, is listed as one of the national public health objectives in Healthy People 2010 [8].

Support programs to increase breastfeeding among low income women have been implemented by many US public health departments. These programs are often based on the United States Department of Agriculture’s (USDA) Food and Nutrition Service’s “Loving Support™” model [9], which uses mothers from the community who have breastfeeding experience and have received training as peer counselors. Although many studies have concluded that such peer counselor (PC) support programs improve breastfeeding outcomes, recent reviews have noted that most of the studies do not use convincing analytic methods to uncover the causal effect of the programs [10–12]. The only randomized controlled trial for a disadvantaged U.S. population is for urban, low income, Hispanic women [13]. Findings from the study show that the program increased breastfeeding at initiation (22.7% of controls vs. 8.9% of the intervention group did not initiate breastfeeding), 1 month, and 3 months, but not at 6 months. Despite the lack of empirical evidence, $14.9 million in fiscal year (FY) 2004 was appropriated in the USDA budget for breastfeeding support programs for WIC participants and another $14.8 million was appropriated in FY 2005 and FY 2006 [9].

The purpose of this study was to estimate the effectiveness of a PC breastfeeding program for low income women in Michigan. The effectiveness was assessed by a quasi-experimental design in which we exploit the fact that, because there was substantially more demand for program services than what could be provided, many expectant mothers who requested service were not subsequently contacted by a PC. The use of this quasi-experimental design has the potential to eliminate the bias that typically exists when instead one compares participants to non-participants. We first examined whether the contact process was consistent with our quasi-experimental interpretation. Given an affirmative finding, we estimated the causal effect of the support program on breastfeeding initiation and duration. The analysis used data from several programmatic and state administrative data sources for women from five Michigan counties who requested services from the PC program during the years 2002–2004 and were enrolled in WIC. Our final sample included 336 women who requested services prenatally and were contacted by a PC (the treatment group) and 654 women who requested services prenatally but were not contacted by a PC (the control group).

Methods

The Breastfeeding Initiative: A Breastfeeding Support Program

The Breastfeeding Initiative (BFI) is a collaboration between Michigan’s Women, Infants, and Children (WIC) Program and Michigan State University Extension. The program operated in 17 Michigan counties in 2002 and then expanded to 22 counties in 2004, the period we analyzed (P. Benton, BFI program leader, oral communication, February 2008). The program provides breastfeeding education and support to low income women through peer counselors. The PCs are recruited from the community; must have a high school education or equivalent, have access to a means of transportation, and report a positive breastfeeding experience with their own baby; and are trained in how to provide breastfeeding support.

Potential program participants are recruited by personnel at WIC clinics and are asked to fill out a referral form. The PCs then use the referral forms to contact women to provide program services. For women who are served, the PCs provide at least one contact to mothers in person, with subsequent contacts in person or by telephone based on the type of support needed. The subsequent contacts are at least monthly. Program data from this time period indicate that participants enrolling prenatally received on average 3 home visits, 2 personal contacts in locations other than the home, and 6 telephone contacts during their participation in the program. Mothers remain in the program until they discontinue breastfeeding, the baby is 1 year old, or support services are no longer desired. Prenatally enrolled women participated in the program for an average of 24 weeks. A detailed description of the program and a qualitative evaluation from the perspective of PCs and participants have been previously published [14].

Estimation Strategy

Our interest was to determine the causal effect of receiving the PC support program on various breastfeeding outcomes. A naïve estimate could be obtained by comparing a breastfeeding outcome for participants in the program to non-participants. However, to the extent that mothers who would have breastfed in the absence of the program were more likely to participate in the program, then such a naïve estimate could overstate the effectiveness of the program; the difference in breastfeeding rates between participants and non-participants would reflect both the effectiveness of the program and the higher motivation of the participants. In such a situation, the naïve estimate is said to be plagued by endogeneity bias. A common solution to endogeneity bias is to rely on an experimental design, where a group of individuals are randomly assigned to be either in a treatment group or a control group.

Although a true experimental design was not built into the BFI program, a feature of the program existed that closely approximated an experiment. Specifically, there was substantially more demand for the services of the program than could be provided, so PCs contacted only some of the individuals who filled out a referral form. Any selective contact, at most, could have been based on the limited amount of information available on the referral forms. Under the assumption that who PCs contacted is independent of underlying breastfeeding propensities (the unobservable component of breastfeeding conditional on referral form information), we could compare those who requested service and were contacted, the treatment group, to those who requested service and were not contacted, our control group, to obtain the causal effect of the BFI program. Fortunately, we were able to explore the validity of this key assumption with our data. We provide these results below.

We tested for differences in pre-program characteristics between the treatment and control groups with multiple linear regression. To isolate treatment-control differences within county, the level at which the programs are administered, we included county indicator variables in all regressions. We also tested for differences in outcomes between the treatment and control groups using multiple linear regression, including other explanatory variables to adjust for any differences that existed between the two groups and to increase the precision of our statistical tests. All reported P-values are based on two-tailed t-tests, with significance denoted at P < 0.10, P < 0.05, and P < 0.01. For dichotomous outcome variables, we re-estimated the models using logistic regression; because none of our substantive findings differed under these models, we report multiple linear regression results for all outcomes for consistency and simplicity. We performed all of our analyses with Stata, version 9 [15].
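As a minimal sketch of the within-county comparison described above: including county indicator variables in a linear regression of an outcome on a treatment indicator gives the same treatment coefficient as subtracting county means from both variables and regressing the demeaned outcome on the demeaned treatment indicator (the Frisch-Waugh-Lovell result). The Python below illustrates this on made-up records. The function name, county labels, and numbers are hypothetical; this is not the study's data or its Stata code, and it omits the additional covariates and the standard errors.

```python
from collections import defaultdict


def within_county_difference(records):
    """Coefficient on `treated` from an OLS regression of `outcome` on
    `treated` plus a full set of county indicators.

    Computed by demeaning within county, which yields the same
    coefficient as including the county indicators explicitly.
    `records` is an iterable of (county, treated, outcome) tuples.
    """
    by_county = defaultdict(list)
    for county, treated, outcome in records:
        by_county[county].append((float(treated), float(outcome)))

    numerator = 0.0
    denominator = 0.0
    for rows in by_county.values():
        mean_t = sum(t for t, _ in rows) / len(rows)
        mean_y = sum(y for _, y in rows) / len(rows)
        for t, y in rows:
            numerator += (t - mean_t) * (y - mean_y)
            denominator += (t - mean_t) ** 2

    if denominator == 0.0:
        raise ValueError("no within-county variation in treatment")
    return numerator / denominator


# Hypothetical data: (county, contacted by a PC?, weeks of breastfeeding)
example = [
    ("A", 1, 9.0), ("A", 1, 7.0), ("A", 0, 5.0), ("A", 0, 5.0),
    ("B", 1, 4.0), ("B", 0, 1.0), ("B", 0, 2.0), ("B", 0, 3.0),
]
estimate = within_county_difference(example)
print(f"within-county treatment-control difference: {estimate:.2f} weeks")
```

In this toy example the raw treatment-control gaps are 3 weeks in county A and 2 weeks in county B; the regression estimate is a weighted average of the two, with more weight on the county where treatment status varies more.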

Data Sources

Our analysis relied on several data sources. The first data source was an initial referral form through which expectant mothers requested services from the BFI. This form contains the name of the mother and infant due date, contact address, WIC identification number, previous breastfeeding experience, whether the expectant mother was subsequently contacted, and, in some cases, race/ethnicity. Although the BFI program operated in about 20 Michigan counties, referral forms were sufficiently completed and retained in only five counties: Lenawee, Monroe, Newaygo, Sanilac, and Wayne. The second data source was forms completed by PCs for all women who eventually participated in the BFI program. These forms were completed at program enrollment, infant birth, and program exit and included WIC identification number, name of mother, mother’s birth date and address, and infant’s name and birth date. The final data source was state administrative data contained in the Michigan Department of Community Health Data Warehouse, including data from WIC administrative records, Medicaid administrative records, and Vital Records.

The two BFI data sources, the referral forms and the program forms, provided us information about who belonged in the treatment and control groups. In addition, the identifying information on the forms was used to match these women to the state administrative data. For the treatment group, we first matched on WIC ID and infant date of birth and progressed through other identifiers such as mother and infant last name, county of residence, and infant first name (infant last name was not always recorded and may differ from the mother’s). Because only referral form information was available for women in the control group, the matching algorithms focused on WIC ID, mother’s last and first name, and mother’s due date. Match rates with the state data for the BFI treatment and control groups were 78.3% and 68.0%, respectively. The lower match rate for controls was to be expected given that less identifying information was available for them.
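The stepwise matching described above can be pictured as a cascade: try the strictest set of identifiers first and fall back to weaker sets only when no match is found. The sketch below is illustrative only. The field names, the exact ordering of the fallback key sets, and the rule of accepting a match only when it is unique are assumptions, not details reported for the actual matching algorithms; a real linkage would also normalize names and dates before comparing them.

```python
def match_record(program_record, state_records, key_sets):
    """Return (state_record, keys_used) for the first unique match.

    Key sets are tried in order, strictest first. A key set is skipped
    when the program record lacks one of its fields, and a candidate is
    accepted only if exactly one state record agrees on every field.
    Returns (None, None) when no key set yields a unique match.
    """
    for keys in key_sets:
        if any(program_record.get(k) in (None, "") for k in keys):
            continue
        hits = [
            s for s in state_records
            if all(s.get(k) == program_record[k] for k in keys)
        ]
        if len(hits) == 1:
            return hits[0], keys
    return None, None


# Hypothetical key sets for the treatment group (field names are made up).
TREATMENT_KEYS = [
    ("wic_id", "infant_dob"),
    ("mother_last", "county", "infant_first"),
]

state_data = [
    {"wic_id": "W1", "infant_dob": "2003-05-01", "mother_last": "Lee",
     "county": "Wayne", "infant_first": "Ana"},
    {"wic_id": "W2", "infant_dob": "2003-06-10", "mother_last": "Diaz",
     "county": "Monroe", "infant_first": "Sam"},
]

# A program form with no WIC ID recorded falls through to the second key set.
form = {"wic_id": "", "infant_dob": "2003-06-10", "mother_last": "Diaz",
        "county": "Monroe", "infant_first": "Sam"}
matched, keys_used = match_record(form, state_data, TREATMENT_KEYS)
```

For the control group, the same function could be called with key sets built only from referral form fields (for example WIC ID, mother's name, and due date), which is one way to see why fewer usable identifiers lead to a lower match rate.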

Once the data were matched, we obtained breastfeeding information, household income, gestational age, and head circumference from WIC administrative data. Our key dependent variables on breastfeeding were constructed from the WIC administrative data. We obtained race/ethnicity information from the Medicaid data. We obtained various pregnancy and birth characteristics from Vital Records (e.g., Apgar, tobacco use, adequacy of prenatal care, birth weight, etc.). The information in these latter two data sets allowed us to assess the validity of our estimation strategy and to adjust our findings for pre-programmatic group differences.

Our analysis sample contains all women who were successfully matched to the state administrative data and for whom the WIC administrative data contained breastfeeding information. For women who were missing other data elements, we defined an indicator variable for each data element and included this indicator variable in our regression models. This analysis strategy allowed us to retain the observations in our regressions and allowed the observations with missing data to differ systematically from the observations without missing data. We provide sample size information in the Appendix table to make clear the extent to which data elements were missing.
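A minimal sketch of this missing-indicator approach follows. For each covariate with gaps, the missing entries are set to a constant and a companion 0/1 variable flags them; both columns then enter the regression. The fill value of zero is an assumption for illustration (the text does not say which constant was used), and the function name and birth weights are hypothetical.

```python
def add_missing_indicator(values, fill=0.0):
    """Split a covariate with gaps into (filled_values, missing_flags).

    `values` uses None for a missing entry. Missing entries are replaced
    by `fill`, and the flag is 1 where the original value was missing.
    """
    filled = [fill if v is None else float(v) for v in values]
    flags = [1 if v is None else 0 for v in values]
    return filled, flags


# Hypothetical birth weights in grams; None marks a missing value.
birth_weight = [3300, None, 2950, None, 3410]
bw_filled, bw_missing = add_missing_indicator(birth_weight)
```

Because the flag absorbs the average outcome difference for observations with a missing value, the particular fill constant does not affect the coefficient on the covariate itself; the observations simply stay in the sample.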

Results

Verifying the Quasi-Experimental Estimation Strategy

Our key identifying assumption was that the provision of services among those who requested service was independent of underlying breastfeeding propensities within each county. We examined whether the data were consistent with this assumption by comparing various pre-program characteristics of the treatment group and the control group. These comparisons are provided in Table 1. To assess whether there were statistical differences between the treatment and control groups within a county, we estimated a linear regression for each of the listed characteristics in which we included county indicator variables; the P-values in the final column of Table 1 are from these regressions.

Table 1 Comparing pre-program characteristics for treatment and control groups

We divided the pre-program characteristics into three categories: background characteristics of the mother, pregnancy characteristics, and birth characteristics. With respect to the background characteristics, the only significant difference between the treatment and control groups is in whether the mother had a prior pregnancy. The participants in the treatment group were about 12 percentage points less likely to have had a prior pregnancy (49.7% vs. 61.4%; P = .001). There is weaker evidence of a difference in whether the mother is Hispanic (treatment 6.5% vs. control 5.5%; P = .068). There are no significant differences at the .10 level for the other background characteristics: mother’s race, mother’s age, household monthly income, whether there was a prior pregnancy within 18 months, and whether the mother had any previous breastfeeding experience.

We also examined whether there were differences in several pregnancy characteristics (tobacco use during pregnancy, drinks per week during pregnancy, early prenatal care, and adequate prenatal care) and birth characteristics (whether the infant was female, birth weight, gestational age, head circumference, Apgar score, and whether the infant was admitted to the neonatal intensive care unit (NICU)). There is evidence that the treatment group mothers were less likely to smoke during pregnancy (23.1% vs. 24.9%; P = .043) and weaker evidence that the treatment group’s infants weighed more (3291.7 g vs. 3259.3 g; P = .070) and had higher Apgar scores (9.04 vs. 8.95; P = .100). The remaining characteristics are statistically indistinguishable between the treatment and control groups.

We consider these results largely supportive of our study design for several reasons. First, the strongest difference we observed was for whether there was a prior pregnancy, with the treatment group being less likely to have had a prior pregnancy than the control group. This characteristic was one of the few characteristics that could be identified from the referral form. We incorporated the possibility that counselors systematically chose whom to call back based on these characteristics, as well as other observable characteristics, in our analysis below. Second, at the standard significance level of .05, there were differences between the treatment and control group on only one other characteristic (tobacco use during pregnancy). We expected to find pre-program differences on a characteristic or two given the number of characteristics we examined.
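The expectation that a characteristic or two would differ by chance can be illustrated with simple arithmetic. The sketch below assumes 17 characteristics (a tally of those named in the text; Table 1 may list more rows) and, as a simplification, treats the tests as independent with no true differences between the groups. Neither assumption comes from the study itself.

```python
def expected_false_positives(n_tests, alpha):
    """Expected number of tests rejecting at level alpha when every
    null hypothesis is true."""
    return n_tests * alpha


def prob_at_least_one(n_tests, alpha):
    """Probability of at least one rejection at level alpha, assuming
    independent tests and that every null hypothesis is true."""
    return 1.0 - (1.0 - alpha) ** n_tests


N_CHARACTERISTICS = 17  # assumption: count of characteristics named in the text

for alpha in (0.05, 0.10):
    exp_fp = expected_false_positives(N_CHARACTERISTICS, alpha)
    p_any = prob_at_least_one(N_CHARACTERISTICS, alpha)
    print(f"alpha={alpha:.2f}: expected chance differences={exp_fp:.2f}, "
          f"P(at least one)={p_any:.2f}")
```

Under these assumptions, roughly one chance difference at the .05 level and between one and two at the .10 level would be unsurprising, which is in line with the reasoning above.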

The Effect of the BFI on Breastfeeding

We present our results of the effects of the BFI on breastfeeding in Table 2 and Fig. 1. Our results indicate that the BFI was very effective in increasing breastfeeding among the treatment group. To provide an initial indication of the effectiveness of the program, we present the unadjusted breastfeeding duration and rates for both groups. Mean total duration for the treatment group was 7.8 weeks, whereas the similar duration was only 5.7 weeks in the control group. As is made clear by examining breastfeeding duration at different time points (see Table 2 and Fig. 1), the longer duration was due to the treatment group being more likely to initiate breastfeeding and then continuing at 3 and 6 months.

Table 2 Comparing breastfeeding outcomes for the treatment and control groups
Fig. 1 Percent breastfeeding over time by group

We assessed the statistical significance of our results by estimating multiple linear regression models. The first regression (regressor set 1) includes only county indicator variables, allowing us to isolate within-county treatment-control differences. The results suggest that the treatment group breastfed 2.6 weeks longer than the control group, a difference that is strongly statistically significant (P < .001). We also examined breastfeeding differences at various time points. The results suggest that the treatment group was 22.3 percentage points more likely to initiate breastfeeding (P < .001), 9.0 percentage points more likely to breastfeed at 3 months (P = .002), and 6.2 percentage points more likely to breastfeed at 6 months (P = .008). The treatment-control differences at the other duration time points (9 and 12 months) were not statistically significant at the 10% level.

We estimated two additional sets of treatment-control differences to further probe the validity of our quasi-experimental strategy. One additional set of treatment-control differences was estimated by including those characteristics that were potentially observable by peer counselors in the BFI program, including race/ethnicity of the mother, age of the mother, whether the mother had a prior pregnancy, and whether the mother previously breastfed a child; this set of regressors is referred to as regressor set 2. The second additional set of treatment-control differences was estimated by including all pre-program characteristics that were listed in Table 1; this larger set of regressors is referred to as regressor set 3. Results for these two additional regressor sets are also presented in Table 2.

There were small systematic changes when we compared the estimated differences with regressor set 1 to those with regressor set 2, but basically no change when we compared the estimated differences with regressor set 2 to regressor set 3. For example, the estimated difference in total weeks of breastfeeding increased from 2.62 weeks with regressor set 1 to 3.49 weeks with regressor set 2. This large estimated difference with regressor set 2 is still significantly different from zero (P < .001) and remains within the 95% confidence interval of the estimated difference with regressor set 1. When we included the exhaustive set of pre-programmatic characteristics as regressor set 3, the estimated difference increased slightly to 3.61 weeks. A similar pattern is observed when comparing the results for the other breastfeeding outcomes.

These findings are consistent with there being some systematic contact of referred women based on the information contained on the referral form, but no systematic contact based on the numerous other pre-program characteristics that are contained in our data but were not observable by peer counselors. These results provide further evidence of the validity of our key identifying assumption about the process of PCs contacting referred women. It is worth noting that the nature of the systematic contact implies that counselors contacted women who were less likely to breastfeed, so that, without adjustment, the systematic contact makes the program look less beneficial than it actually was. Based on this interpretation, our preferred set of estimates is those with regressor set 3 and we interpret the estimates to be the causal effect of the BFI program on breastfeeding behavior.

Discussion

We examined the effectiveness of a peer counseling breastfeeding support program for low income women in Michigan who also participated in WIC. Using a quasi-experimental methodology that stems from the program having excess demand for its services and data derived from programmatic and state administrative sources, we estimated the causal effect of the support program on several breastfeeding outcomes.

We first presented results to examine the validity of our key identifying assumptions. Specifically, we compared the treatment and control group along a range of pre-programmatic characteristics. Although there is evidence of the systematic contact of referrals on some of the characteristics on the referral form (whether the mother had a prior pregnancy most notably), there is little evidence of systematic recruitment based on characteristics that were not known to the BFI program at the time of recruitment. These results supported our assumption that the process of contacts could be used as a quasi-experiment to identify the causal effect of the BFI program.

We then estimated the causal effect of the peer counseling program on breastfeeding outcomes. Our preferred estimates, which take into account the possibility of systematic recruitment on the characteristics that were observable by peer counselors and other pre-program characteristics to adjust for any remaining differences, indicated that the program was very effective at increasing breastfeeding among women in the treatment group. We found that the support program led to 3.6 additional weeks of breastfeeding for the treatment group, a very large effect when compared to the 5.7 weeks of average breastfeeding among the control group. Our results also indicated that this longer duration was due to more breastfeeding in the treatment group initially and at 3 and 6 months.

Our findings of programmatic effects through 6 months were more sustained than those of the only U.S.-based study for low income women that used a true experimental design [13]. The previous study, which found significant programmatic effects only at initiation and through 3 months, reported the support program as understaffed, with less than 10% of women in their treatment group reporting a peer counselor contact past 1 month postpartum. By contrast, the BFI program requires monthly peer counselor contacts for all participants until they exit the program. Continued support past the initiation of breastfeeding may be critical for extending breastfeeding duration as women encounter challenges such as returning to work and the issues of breastfeeding older infants, such as the introduction of solid foods and teething.

There are several limitations to our study. The first rests with the validity of our key assumption regarding how women were contacted. Although we found little evidence that was inconsistent with our key assumption, the assumption itself cannot be tested directly. Thus, a true randomized controlled trial of such a PC program would be useful to corroborate the results we report here. The second and more important limitation rests with the external validity of our findings. Strictly speaking, even if our key assumption is valid, our study has identified the average program effect only for women who requested PC services. It may be that women who request service make better use of the assistance provided by PCs, and thus, the average effectiveness would be larger for women in our study than it would be for the more general population of low income women.

Many studies have documented the low breastfeeding rates among low income mothers, a population whose children are at relatively high risk for poor health outcomes and who often receive government-supported medical care through Medicaid. Given the substantial evidence that breastfeeding imparts health benefits to both the child and mother, a program that increases breastfeeding among low income women could improve the health of an important, vulnerable population and generate large cost-savings for the Medicaid program. Our estimates suggest that the benefits of the BFI program could be substantial: it increased the breastfeeding initiation rate by about 27 percentage points, increased breastfeeding duration by 63% (or more than three weeks), and had lasting effects on breastfeeding rates through the sixth month. Moreover, such PC support programs are relatively inexpensive to administer because of their reliance on peer counselors rather than health care professionals. In light of these encouraging results, the BFI program should be subjected to a rigorous cost-benefit analysis to establish its cost-effectiveness and evaluated in other settings to establish whether its substantial effects are replicable.