Background

In the last two decades, prenatal and early infancy home visitation programs targeting underserved and high-risk families have flourished, in large part due to an emerging evidence base for program impact and cost-effectiveness. Although the strength of evidence differs by program model, maternal and child home visitation by trained professionals has demonstrated beneficial short-term outcomes related to pregnancy and birth, child health and development, and early childhood injury [1–5]. Outcomes have also been sustained, with evidence of improved parenting skills and attachment, improved early educational and behavioral outcomes among children, reductions in parental and child arrests, and improved parental economic self-sufficiency [1, 6–11].

Strong support for home visitation has accompanied the growing evidence base. Recently, the Patient Protection and Affordable Care Act (P.L. 111-148) of 2010 appropriated up to $1.5 billion to expand evidence-based prenatal and early childhood home visitation programs through 2015. Even before this investment, combined public and private annual investment in these services had been estimated at between $750 million and $1 billion. This funding supports services for an estimated 400,000–500,000 families, or about 3% of all families with children under 6 years of age [12].

A national leader in the expansion of home visitation is the Nurse-Family Partnership (NFP), a program of weekly prenatal, infant, and toddler home visitation delivered by nurses to high-risk, first-time mothers. The NFP program is grounded in more than 30 years of evidence from randomized controlled trials designed to study the effects of the NFP model on maternal and child health, as well as child development. Results of the trials have shown replicated program effects on a number of maternal and child outcomes, including maternal smoking cessation, increased birth intervals between first and second pregnancies, and a reduction in early childhood injuries [2–5]. The program has also demonstrated long-term outcomes of reduced maternal welfare receipt [6, 9, 10] and, many years later, reductions in antisocial behaviors among adolescents born to program recipients [7, 8].

The rapid growth of NFP and other home visitation programs has brought the challenge of maintaining effectiveness as programs disseminate to local populations and communities not well represented in the original trials. Because few empirical data exist to substantiate the continued success of home visitation programs after dissemination, this issue underscores the significance of rigorous program evaluation following replication. A recent evaluation of the NFP program over a seven-year period following statewide dissemination in Pennsylvania reported reductions in short birth intervals for second pregnancies that significantly improved over time following implementation. Furthermore, the evaluation found significant variation in program effect across sites; some sites, particularly rural locations, surpassed earlier trial outcomes, and others struggled to meet the same benchmarks [13].

This study sought to build upon this earlier work by examining a second program outcome: childhood injuries in the first 2 years of life. The NFP program includes childhood injury prevention and maltreatment education, implemented following a scheduled curriculum that begins in prenatal visits and continues through toddlerhood visits. Additionally, nurses are encouraged to implement injury and maltreatment education following any assessment that determines need. The goal of this study was to assess program effectiveness for reducing early childhood injuries following statewide implementation, and to examine the nature of program variation across sites throughout the state.

Methods

The primary sources of data covering a service period of January 1, 2003 to December 31, 2009 were: (1) the enrollment history of clients participating in 24 NFP programs throughout Pennsylvania; (2) birth certificate files from the Pennsylvania Department of Public Health; (3) death certificate files from the Pennsylvania Department of Public Health; (4) welfare eligibility files from the Department of Public Welfare; and (5) Medicaid claims data from the Department of Public Welfare.

The target population comprised clients from the 24 NFP sites in Pennsylvania who were enrolled in the NFP program between January 1, 2003 and December 31, 2007. Included were women who: (1) delivered a first-born infant who was not medically high-risk (see below for definition of medically high-risk); (2) were successfully linked to the Medicaid claims of their child following birth; and (3) received welfare assistance from the Commonwealth of Pennsylvania within 12 months prior to the birth of their first-born infant. The NFP study sample used in this analysis represents a modification of a study sample used in a previous analysis of second pregnancy spacing [13]; clients included in this sampling frame were subject to additional exclusionary criteria and were drawn from a sample that included two additional years of enrollment (2006–2007).

To obtain an unbiased sample of children with potential exposure to the injury outcome during the first 2 years of life, medically high-risk infants were excluded prior to selection of the unexposed comparison group. The study team defined medically high-risk as infants meeting any of three criteria: infants born prior to 25 weeks' gestation; infants who died at birth or within the first 14 days of life; and infants whose death resulted from a congenital or perinatal condition. Exclusionary congenital and perinatal conditions from death certificates were defined as the following: ICD-9 cause of death listed as congenital anomaly or chromosomal syndrome (740–759); fetal effects of maternal conditions, pregnancy complications or labor and delivery complications (760–773); fetal effects originating in the perinatal period, such as malnutrition, hemorrhage, respiratory distress syndrome, and jaundice (764–779.9); and ICD-10 cause of death listed as extreme prematurity (P072).

Eligible women for an unexposed comparison group were identified following a previously described linkage to birth certificate and welfare eligibility data from women residing in NFP communities who were also expecting a first-born infant, received welfare assistance in the year prior to their infant’s birth, and met the exclusion criteria noted above [13]. To identify a comparison group from among the unexposed eligible women, a propensity score analysis used data from birth certificates and welfare eligibility files to model factors associated with a woman’s participation in NFP. The factors included maternal education (<12th grade, high school, some college or higher), maternal race (White, Black, Hispanic, Other), marital status (y/n), history of smoking before and/or during the first trimester of pregnancy (y/n), TANF receipt before and/or during the first trimester of pregnancy (y/n), food stamp receipt before and/or during the first trimester of pregnancy (y/n), and history of gestational diabetes (y/n). In addition, variables were included that encoded high-density ZIP codes within each agency catchment area in order to drive the selection of unexposed comparison women toward high-penetration neighborhoods of interest. This density variable was created by flagging as high density any ZIP code that accounted for more than 5% of the NFP client population within the agency catchment area. Finally, models were stratified on maternal age (≤18 years, >18 years) and time period of birth cohort (2003–2005, 2006–2007) to force balancing on these factors for subsequent stratified analyses.
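The density rule described above can be illustrated with a minimal Python sketch. The function name, the input format (one ZIP code string per enrolled client of a single agency), and the example ZIP codes are assumptions made for illustration; they are not taken from the study's own code.

```python
from collections import Counter

def high_density_zips(client_zips, threshold=0.05):
    """Return the ZIP codes that each hold more than `threshold`
    (default 5%) of one agency's enrolled clients.

    client_zips: list with one ZIP code string per client (hypothetical format).
    """
    total = len(client_zips)
    if total == 0:
        return set()
    counts = Counter(client_zips)
    return {z for z, n in counts.items() if n / total > threshold}

# Hypothetical agency with 40 clients spread over three ZIP codes.
example_zips = ["19104"] * 30 + ["19139"] * 9 + ["19143"] * 1
example_flags = high_density_zips(example_zips)
```

In this toy agency, the first two ZIP codes hold 75% and 22.5% of clients and are flagged, while the third holds 2.5% and is not.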

Using a separate logistic regression for each agency, the expected probability of participation in NFP was then determined based on the above characteristics for each woman within an agency [14, 15]. To achieve matching, the expected probability of participation in NFP (the propensity score) was first estimated for each woman who participated in NFP. Next, all unexposed comparison women whose propensity scores fell outside the range of propensity scores of NFP clients were excluded as potential matches. This initial exclusion left a group of unexposed comparison women and a group of NFP clients who shared propensity scores with “common support,” that is, overlapping ranges of propensity scores. Using a caliper of 0.05, one or more unexposed comparison women were selected for each client using a nearest neighbor match without resampling (up to a maximum of 4 matched comparison women per client). Matching was performed with the “%gmatch” macro in the SAS® Statistical Package v9.1.3 [16]. To avoid bias from differential matching rates across clients, analysis weights were assigned to comparison women based on the number of unexposed women matched to the NFP client.
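The matching steps can be sketched in Python as follows. This is a simplified greedy illustration and not a reimplementation of the SAS macro used in the study: the function name, the dictionary inputs, the round-robin order of matching, and the 1/k weighting (each of the k comparison women matched to a client receives weight 1/k) are all assumptions, since the text does not specify these details.

```python
def caliper_match(clients, comparisons, caliper=0.05, max_matches=4):
    """Greedy nearest-neighbor matching without replacement (illustrative).

    clients, comparisons: dicts mapping an ID to a propensity score.
    Returns ({client_id: [comparison_id, ...]}, {comparison_id: weight}).
    """
    lo, hi = min(clients.values()), max(clients.values())
    # Common support: drop comparison women outside the clients' score range.
    pool = {cid: s for cid, s in comparisons.items() if lo <= s <= hi}

    matches = {c: [] for c in clients}
    # Round-robin passes: pass k looks for each client's k-th match, so that
    # clients processed early do not exhaust the pool (an assumed design).
    for _ in range(max_matches):
        for client_id, score in clients.items():
            if not pool:
                break
            best = min(pool, key=lambda cid: abs(pool[cid] - score))
            if abs(pool[best] - score) <= caliper:
                matches[client_id].append(best)
                del pool[best]  # without replacement

    # Assumed weighting: 1/k for each of the k women matched to a client.
    weights = {}
    for matched in matches.values():
        for cid in matched:
            weights[cid] = 1.0 / len(matched)
    return matches, weights

# Hypothetical scores for two clients and five candidate comparison women.
demo_matches, demo_weights = caliper_match(
    {"n1": 0.30, "n2": 0.60},
    {"c1": 0.31, "c2": 0.33, "c3": 0.58, "c4": 0.90, "c5": 0.10},
)
```

Here `c4` and `c5` fall outside the clients' score range and are dropped before matching; the remaining three women are assigned to their nearest client within the caliper.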

The primary outcome was a count of injury episodes to children of NFP clients and matched comparison women in the first 2 years of life. Birth certificate data and welfare eligibility files permitted a linkage to Medicaid claims files to facilitate the identification of injury claims from emergency department visits and hospitalizations on infants in the first 2 years of life. Outpatient procedure codes were restricted to emergency department visit and emergency department supplemental service codes (99281-99285, W9029, W9045, W9047, W9048). Consistent with prior pediatric injury studies [17], ICD-9-CM diagnosis codes were identified as potential injuries if they were in the range of 800–909.2, 909.4, 909.9, 910–994.9, and 995.5–995.59. Excluded were codes that indicated late effect complications, complications due to medical care, adverse reactions, and systemic inflammatory response. Deaths due to injury in the first 2 years of life were encoded as a hospitalization for injury under the assumption that a hospitalization for the fatal injury would have occurred had the child survived.
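As a rough illustration of the diagnosis-code screen, the Python sketch below tests whether an ICD-9-CM code falls in the ranges listed above. The function name and the assumption that codes arrive as dotted strings (for example "812.01") are hypothetical; real claims files often store codes without the decimal point and would need normalizing first.

```python
from decimal import Decimal, InvalidOperation

# Inclusive ICD-9-CM ranges named in the text as potential injuries.
INJURY_RANGES = [
    (Decimal("800"), Decimal("909.2")),
    (Decimal("909.4"), Decimal("909.4")),
    (Decimal("909.9"), Decimal("909.9")),
    (Decimal("910"), Decimal("994.9")),
    (Decimal("995.5"), Decimal("995.59")),
]

def is_potential_injury(code):
    """Return True if a dotted ICD-9-CM code string lies in an injury range."""
    try:
        value = Decimal(code.strip())
    except InvalidOperation:
        return False  # non-numeric codes, e.g. V- or E-prefixed codes
    if not value.is_finite():
        return False
    return any(lo <= value <= hi for lo, hi in INJURY_RANGES)
```

Decimal arithmetic is used rather than floats so that boundary codes such as 909.2 and 995.59 compare exactly.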

To avoid over-counting injury events due to follow-up visits in the emergency department, episodes of injury were identified for each child with multiple outpatient injury claims. Mirroring prior studies [18, 19], a sequence of claims with the same diagnosis or clinically relevant diagnosis occurring within a 180-day interval was classified as a single injury episode. Multiple injuries within the 180-day interval that were not similar were classified as unique injury episodes. For inpatient claims, an episode of injury encompassed all claims occurring within the duration of a continuous hospital stay. Inpatient claims with admission dates adjacent to a previous discharge date were similarly reviewed (D.R., M.M.); identical or clinically relevant diagnoses occurring in adjacent inpatient claims were classified as one injury episode.
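The outpatient episode rule can be sketched in Python as below. This is a simplification under stated assumptions: only identical diagnosis codes are grouped (the study also grouped clinically relevant diagnoses, which required judgment), and the 180-day window is measured from the first claim of each episode, which the text does not specify. The function name and claim format are hypothetical.

```python
from datetime import date

def count_episodes(claims, window_days=180):
    """Count injury episodes for one child.

    claims: iterable of (service_date, diagnosis_code) tuples.
    Claims with the same diagnosis within `window_days` of the first claim
    of an episode are folded into that episode (assumed window definition).
    """
    episode_start = {}  # diagnosis code -> start date of its current episode
    episodes = 0
    for service_date, dx in sorted(claims):
        start = episode_start.get(dx)
        if start is None or (service_date - start).days > window_days:
            episodes += 1
            episode_start[dx] = service_date
    return episodes

# Hypothetical child: a fracture with a follow-up visit, an unrelated
# contusion, and a second fracture more than 180 days after the first.
demo_claims = [
    (date(2005, 1, 10), "813.4"),
    (date(2005, 1, 24), "813.4"),
    (date(2005, 3, 1), "920"),
    (date(2005, 9, 1), "813.4"),
]
demo_episode_count = count_episodes(demo_claims)
```

In this toy example the follow-up visit is absorbed into the first fracture episode, giving three episodes in total.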

In addition to the covariates that were included in the propensity score analysis, a proxy for health seeking behavior and health care access was identified by counting the number of non-injury emergency department visits for children of women included in the study. Non-injury visits were identified as claims with outpatient procedure codes restricted to emergency department visit and emergency department supplemental service codes (99281-99285, W9029, W9045, W9047, W9048), excluding claims with ICD-9-CM codes used in the identification of injuries (800–909.2, 909.4, 909.9, 910–994.9, and 995.5–995.59). Each claim with a distinct diagnosis was classified as an episode. A sequence of claims with the same diagnosis occurring within a 14-day interval was classified as a single episode.

Analysis

Data were presented using means (and standard deviations) for continuous variables, and frequencies for categorical variables. Generalized linear models with a Poisson distribution examined the association between episode counts and NFP participation, while stratifying by agency catchment area in a fixed effects analysis. Models included only the exposure status and the covariate of non-injury emergency department visits to account for potential confounding by differences in overall health care utilization between clients and comparison women. We estimated and report robust variance estimates to account for potential lack of model fit owing to possible overdispersion of the data. Results were expressed as incidence rates of injury episodes for clients and comparison women, using an exposure time that was set at 2 years for each child, unless an untimely death in the first 2 years shortened that interval. Because of concern that Medicaid eligibility might have differed between the groups and therefore biased results, a separate sensitivity analysis was conducted on the 2003–2005 cohort (for which complete eligibility data through 2007 were available), in which length of eligibility over the first 2 years signified the exposure time; this analysis, however, did not change the results of the study and is therefore not reported below. Subgroup analyses by agency also permitted the estimation of incidence rate ratios across agencies in the Commonwealth as a means of understanding variation across implementation sites.
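To make the rate calculation concrete, the Python sketch below computes a weighted crude incidence rate and a crude rate ratio from per-child records. It is not the Poisson regression described above: it omits the agency fixed effects, the utilization covariate, and the robust variance estimates. The record layout (episode count, exposure time in years, analysis weight) and all values are hypothetical.

```python
def incidence_rate(records):
    """Weighted episodes per child-year.

    records: iterable of (episode_count, exposure_years, weight) tuples.
    """
    events = sum(n * w for n, _, w in records)
    child_years = sum(t * w for _, t, w in records)
    return events / child_years

def rate_ratio(exposed, unexposed):
    """Crude incidence rate ratio, exposed versus unexposed."""
    return incidence_rate(exposed) / incidence_rate(unexposed)

# Hypothetical data: the last client child died at 1 year, shortening exposure.
demo_clients = [(1, 2.0, 1.0), (0, 2.0, 1.0), (2, 2.0, 1.0), (0, 1.0, 1.0)]
# Hypothetical comparison children; two share one client and carry weight 0.5.
demo_comparisons = [(1, 2.0, 0.5), (0, 2.0, 0.5), (1, 2.0, 1.0), (0, 2.0, 1.0)]
demo_irr = rate_ratio(demo_clients, demo_comparisons)
```

A ratio above 1 indicates a higher episode rate among client children; a fitted Poisson model with an exposure offset would estimate the same quantity while adjusting for covariates.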

Analyses were conducted using Stata version 11.0 (College Station, TX) and SAS v9.2 (SAS Institute, Cary, NC). Approval for the study was granted by the Department of Public Welfare for the Commonwealth of Pennsylvania and the Institutional Review Board at the Children’s Hospital of Philadelphia.

Results

From 2003 to 2007, 24 sites enrolled 7,276 clients. Among the 5,909 welfare-eligible clients whose first singleton infants were identified from Pennsylvania birth certificates, 94% were matched to their child’s Medicaid records. Local-area propensity score matching identified unexposed comparison women for 90% of this restricted sample (5,016 women), with a final yield of 16,704 comparison women for this study (Fig. 1).

Fig. 1

Identification of Nurse-Family Partnership clients and local-area matched comparison women using Pennsylvania birth certificate data, welfare eligibility data and Medicaid data between January 1, 2003 and December 31, 2007

In aggregate, women in the study cohort were more likely to be white, unmarried, and from urban areas of the state; 42% of the women were ≤18 years of age (Table 1). Prior to matching, NFP clients were more likely than other welfare-eligible women in the state to be young, unmarried, less educated, and to have a history of smoking; propensity score matching largely balanced the groups on all selection factors.

Table 1 Characteristics of Nurse-Family Partnership clients compared with all potential comparison women, welfare-eligible comparison women, and final matched comparison women across the Commonwealth of Pennsylvania

In total, 6,129 injury visits were identified among children of NFP clients and unexposed comparison women. Of these, 1,613 were to children of NFP clients. The distribution of the frequency of injury visits per child ranged from 0 to 13. Children of NFP clients were more likely than comparison children to have at least 1 injury visit (32% vs. 27%), but were less likely to have 5 or more injury visits (0.1% vs. 1.0%, Table 2).

Table 2 Distribution of number of visits per child

In aggregate, the children of NFP clients had higher rates of injury visits in the first 2 years of life than the children of comparison women (415.2/1,000 vs. 364.2/1,000, P < 0.0001, Table 3). This difference persisted despite significantly increased emergency department utilization overall by NFP clients for their children (visit rate: 3.0 per child for NFP clients vs. 2.7 per child for comparison women, P < 0.0001). Significantly higher rates of visits for superficial injuries (e.g., abrasions, bruises, lacerations) among children of NFP clients than among comparison children (156.6/1,000 vs. 132.6/1,000, P < 0.0001) largely explained the difference in overall injury visits. Barring small absolute differences in hospitalization rates (15.3/1,000 for NFP vs. 11.4/1,000 for comparison children, P < 0.038) and motor vehicle accidents (4.5/1,000 vs. 1.9/1,000, P < 0.006), visit rates for injuries of increasing severity and suspicion for child abuse [20] were similar between groups.

Table 3 Rates of incidence of injury episode visits

Significant variation in injury visit rates was identified across individual agencies (Fig. 2). The proportion of children with at least one injury visit varied from 14.5 to 42.5% among agencies. At the same time, there was also variation within agency catchment areas when clients were compared to their local community comparison populations. Figure 2 shows the absolute risk differences between NFP clients and comparison women for each of the 24 agencies across three outcomes. The x-axis shows the significant variation across agencies in client injury visit rates, while the y-axis shows the variation in risk differences from the comparison populations. While agency variability in risk differences persisted across all three outcomes, in general, variation in risk differences between NFP clients and comparison women narrowed for more serious (less frequent) events; the absolute risk differences among agencies for total injuries ranged from −0.3 to 14.5%, while the absolute risk differences for hospitalization ranged from −0.6 to 2.4%.

Fig. 2

Risk difference plots describing agency variation in injury visit rate differences between NFP clients and comparison women. Plots shown for (a) total injury visits; (b) head injury or fracture visits; (c) hospitalizations

Discussion

This propensity matched analysis following statewide implementation of the NFP was unable to find positive program effects on injury visits for children. Injury visit rates to emergency departments and hospitals in the first 2 years of life were higher overall among children of NFP clients compared to the children of locally matched unexposed comparison women. This finding was explained mostly by increased emergency department utilization for superficial injuries among children born to NFP clients. Visit rates for more serious injuries (e.g. fractures and head injuries), as well as injuries with suspicion of child abuse, were similar between children born to NFP clients and children born to comparison women.

Despite the propensity score matching methods employed in this analysis, selection bias remains a notable limitation. Therefore, the findings of this study should be interpreted cautiously, given the concern that selection differences between the NFP clients and the comparison group might persist in spite of control for measured differences between NFP mothers and the comparison group. The factors included in propensity score models were limited to those available through birth certificate and welfare eligibility data. The analysis was therefore subject to a selection bias between the NFP clients and their matched comparison women on factors such as mental illness, substance abuse, and family risk factors not included in the administrative data accessible to the study and on which referral agencies may have based their decisions to refer prospective participants to the NFP. Data from child welfare and behavioral health systems were not available for this study. Consequently, if such a difference existed between NFP clients and comparison women in some agencies, it may explain the higher rate of overall injuries among children born to clients, and potentially the lack of aggregate effect on more serious injuries. This alternative explanation is consistent with the observation that among eligible welfare births, the NFP enrollees were at higher risk than their non-NFP counterparts prior to statistical adjustment.

Such a concern, however, should not completely discount the results we report, as it is also possible that selection bias may operate in favor of the hypothesized benefits of the program on injuries, in that the very highest risk prospective clients may choose not to enroll in the NFP and may have been included in the control group. Despite a persistent concern of unobserved selection factors in observational studies, such studies arguably represent a reasonable option for evaluating programs after dissemination into community settings, as long as their limitations are fully understood. Unless community programs were to use lotteries or randomization as a matter of practice, the choices of suitable methods of evaluation are limited. To that degree, the approach used in this study made use of the best available methods to account for selection bias.

The majority of the injuries to the study cohort were superficial and of minor severity, resulting in emergency department visits; these superficial injuries also exhibited the greatest risk difference between NFP clients and the comparison group. The finding of increased emergency department visitation for superficial injuries among NFP clients is consistent with the observation shown in Table 3 that NFP parents more often sought emergency care for their children for non-injury related reasons. Enrollment in NFP is voluntary; therefore, the women who self-select into the program may differ from eligible women who do not enroll due to unobserved factors. It is plausible that women who self-select into the program may exhibit increased health-seeking behavior resulting in more frequent encounters with the health care system. Alternatively, it is possible that higher utilization may be reflective of increased surveillance of families in the program by home visitation nurses or a curriculum from NFP that encourages and facilitates health care access. The study findings did not change appreciably with the inclusion of a proxy variable for health care utilization, however, suggesting that potential bias from health seeking behavior was likely small. Furthermore, although injury-related primary care visits were not included in this study, there is no a priori evidence that NFP families are less likely to seek care in the primary care setting compared to community peers. A limitation of this study is the inability to include primary care visits as an alternate source of health care utilization from which to assess differences in health seeking patterns between NFP clients and matched comparison women.

To some degree, the findings of increased utilization for superficial injuries by NFP clients and no aggregate differences between the two groups in higher severity injuries may reflect a challenge of when the injury curriculum is provided in relationship to injury events that were observed. This study is unable to examine those temporal relationships. However, the NFP injury components of the curriculum are implemented throughout the entirety of the program, beginning prenatally, so unless the injury curriculum was not fully implemented, it is unlikely that no exposure to the curriculum would have occurred. In addition, whether participant retention beyond birth, which has been a challenge for NFP and other home visitation programs, has eroded benefit during implementation warrants further investigation. Finally, concerns raised by state agencies around child supervision by non-maternal caregivers may warrant investigation into lack of appropriate childcare among mothers returning to school and work following pregnancy.

Ultimately, interpreting these data on childhood injuries, which suggest variation in possible program impact across agencies, illustrates a central challenge for home visitation programs, namely, the difficulty of achieving program model fidelity when there are contextual challenges at the local level that can act as barriers to successful implementation. While NFP employs a formalized and well-resourced protocol for implementation support at the site level, inclusive of two supervisory staff, annual regional and state meetings, and continuing education, standardized evaluation of ongoing model fidelity has not yet been incorporated into implementation protocols. Consequently, to date, rigorous data on site-level fidelity to support quality improvement efforts have not been published. Coupling these data with a more qualitative review of the quality of program delivery and local challenges with implementation may reveal best practices that have led to positive outcomes in several agencies and barriers that have led to poor outcomes in others. Such a qualitative analysis might uncover approaches to quality improvement that seek to mitigate local contextual barriers to achieving fidelity.