Interorganizational collaboration has become a popular strategy for addressing a wide range of problems, from teen pregnancy to cardiovascular disease. Theoretically, interorganizational collaboration is a promising strategy due to its ability to foster relationships among individuals and organizations which had previously worked in parallel, to combine diverse perspectives on an issue and solutions, to devise comprehensive interventions targeted at multiple levels of the community, and to create strength in numbers in order to leverage resources and exert influence on external decision makers, among many other theoretical advantages (Butterfoss and Kegler 2002). In many instances, the purpose of collaboration is to create change in population-level health or behavior outcomes. However, evidence of effectiveness in doing so is lacking (Hallfors et al. 2002; Klerman et al. 2005; Kreuter et al. 2000; Roussos and Fawcett 2000), at least in part due to substantial methodological challenges inherent to this research. These challenges include the small size of samples of communities, the limited feasibility of random assignment of communities to experimental conditions, the relatively long study duration that may be required to identify slow-developing changes at the community level, and the causal complexity that must be accounted for in analyses attempting to link collaboration to macro-level outcomes (Berkowitz 2001; Gabriel 2000; Roussos and Fawcett 2000; Yin and Kaftarian 1997). This study examines effects of interorganizational collaboration on a community-level indicator of child and family well-being, the rate of low weight births in Georgia counties. We illustrate a methodological approach that addresses a number of the challenges posed by community-level research, and present results that suggest a relationship between collaboration and improvements in population rates of low weight births.

The community collaboratives in this study are part of Georgia Family Connection (FC), a statewide network of community collaboratives working to improve child and family well-being. FC collaboratives are very similar in nature to other forms of interorganizational collaboration that have been variously referred to as coalitions, partnerships, networks, etc. (e.g., Butterfoss 2007; Granner and Sharpe 2004; Roussos and Fawcett 2000; Zakocs and Edwards 2006). There is not a consensus definition of interorganizational collaboration, but the various definitions commonly identify collaboration as the interaction of diverse parties in the pursuit of a common goal (see for example: Himmelman 1996; Mattesich et al. 2001; Taylor-Powell et al. 1998; Wood and Gray 1991). In the present study, the term community collaboration is used to refer to a version of interorganizational collaboration distinguished by the active participation of private citizens, in contrast to membership that is determined solely by professional affiliation (Butterfoss 2007). As Bailey and Koney (1996) note, community involvement ensures that collaborative activities are germane to community needs and increases the likelihood that collaborative efforts can be sustained beyond formal organizational support.

Georgia initiated its long-term commitment to community collaboration as a strategy to improve the well-being of children and families in 1991 when 15 counties received funding to pilot the FC community collaborative approach. Over time the FC network added new counties so that by 2004 there were 157 collaboratives serving every county in the state. FC collaboratives consist of representatives from public agencies, businesses, faith-based organizations, elected officials, and local citizens. FC collaboratives serve as local decision-making bodies for their communities to improve targeted indicators of child and family well-being such as low infant birthweight, teen pregnancy, high school graduation, and unemployment. Through a community assessment each collaborative identifies the issues of relevance to their community. The FC community assessment is a strengths and needs assessment conducted by members of the collaborative, occasionally with the assistance of a hired contractor. Methods include primary data collection through focus groups, key informant interviews, and surveys, as well as secondary data sources such as Kids Count (Annie E. Casey Foundation 2009a), the US Census, and a variety of other state agency databases. Through the community assessment, collaboratives identify one or more issues, and then develop and implement strategies to address them. In this study we focus on the subset of collaboratives addressing low infant birthweight (LBW), one of the outcomes most commonly targeted by FC collaboratives.

Low infant birthweight is a strong predictor of infant mortality and is linked to a variety of subsequent health problems, such as cerebral palsy, heart and intestinal problems, asthma, and blindness as well as cognitive and social deficits such as lower IQ scores, lower educational attainment, and hyperactivity (Eichenwald and Stark 2008; Hack et al. 1995). Another reason LBW is commonly targeted by FC collaboratives is that it is largely preventable by basic prenatal care. The likelihood of women giving birth to LBW infants can be greatly reduced if mothers maintain a healthy weight, keep up-to-date on vaccinations to prevent infections, take prenatal vitamins, do not smoke, and effectively manage chronic conditions such as diabetes and hypertension (Annie E. Casey Foundation 2009b; Georgia Family Connection Partnership 2009).

FC collaboratives devise strategies that are defined as clusters of related programs, services, and activities designed and supported by a collaborative to achieve desired benchmarks for children, families and communities. As an example of an LBW strategy, several FC counties implemented an early childhood strategy that focused on supporting families with children birth to age four. This strategy included five primary services: universal contact at birth, intensive home visitation, developmental childcare, parenting education, and adult education and job training. Contact at birth and intensive home visitation included core services to improve women’s health such as linkage to a medical home and health education. Referrals and linkages to prevention services during the interconception and prenatal periods were also made, including medical services, dental services, immunizations, healthy lifestyle classes and smoking cessation.

According to the FC theory of change, collaborative operations including engagement of members, governance, finance, planning, and evaluation are expected to affect low birthweight through both individual and community level pathways. Through the strategy, a collaborative may provide (or facilitate or act as a catalyst for) individual-level services (e.g., home visiting), it may facilitate changes in the system of services such that the array of services is more comprehensive and integrated (e.g., bringing home visiting providers together to reduce duplication), and it may implement activities targeted directly at the community level (e.g., by holding a legislative breakfast on LBW or running a public awareness campaign).

It is important to note that the FC theory of change does not limit the development of the strategy to any particular model of intervention. This contrasts with other models of collaboration that are more prescriptive of the method of intervention. Most prescriptive would be community trials which aim for uniform implementation of a single researcher-selected intervention model (see for example: Community Intervention Trial for Smoking Cessation: COMMIT Research Group 1995; Minnesota Heart Health Program: Luepker et al. 1994). As a contrasting example, the Communities That Care model (CTC; Hawkins et al. 2008, 2009) strikes a balance between uniformity and flexibility by specifying that communities use a prevention science approach to identify risk and protective factors for targeted problems and choose from a menu of evidence-supported interventions available to address those factors. The model of collaboration in the present study is even less prescriptive in that FC collaboratives are not limited to the implementation of evidence-supported interventions. This naturally leads to much larger variation in the interventions implemented by FC collaboratives. We raise this distinction because it has implications for the interpretation of differences in outcomes between intervention and comparison communities.

In the present study, there is presumably such large variation within both intervention and comparison groups in terms of intervention methods (i.e., strategies) that the primary difference between intervention and comparison groups is the FC model of collaboration itself. In contrast, in a study of a model of collaboration more prescriptive in terms of intervention methods, such as the CTC model, which implements a limited set of evidence-supported interventions, intervention communities share both the model of collaboration and a more homogeneous method of intervention. In the case of CTC, studies have shown that intervention communities do in fact have higher levels of implementation of evidence-supported interventions than comparison communities (Fagan et al. 2011, 2012).

In the present study, the only consistent difference between intervention and comparison communities is the FC model of collaboration. We assume that intervention communities vary in terms of specific interventions employed to affect individual recipients of services and communities, and also that many comparison counties implemented interventions to address LBW, but the latter did so without the benefit of FC collaboratives. Thus our analyses are a test of whether FC collaboration is associated with improvements in LBW above and beyond effects of conventional services. We test a simple hypothesis, that the rate of change over time in county rates of low infant birthweight will be more favorable where there are FC collaboratives targeting that outcome, than in comparable counties in other southeastern states without FC collaboratives.

Method

Sample

Collaboratives

At the end of each fiscal year since 1997, leaders from FC collaboratives have completed a comprehensive self-assessment (different from the community assessment described earlier) that collects data on a variety of collaborative operations, including the specific outcomes targeted for the year. Targeting means that the collaborative developed a strategy to address a particular outcome. This study is focused on a subgroup of 25 counties with collaboratives that targeted LBW for a sum of at least 2 years from 1997 to 2003. This definition of the intervention offered the best combination of dosage and group size. Because targeting is measured annually, various patterns of targeting are possible over this 7-year period. The average sum of years targeting LBW during this period was 3.4 (SD = 1.5, Min = 2, Max = 7) and the average number of consecutive years targeting was 2.6 (SD = 1.6, Min = 0, Max = 7). Collaboratives may also target additional indicators. Other indicators commonly targeted by these 25 collaboratives included teen pregnancy (83 %), child abuse and neglect (73 %), and school absenteeism (71 %). Counties with collaboratives that did not identify LBW as a priority issue for 2 or more years during this time period were not included in this study so that our comparison was between collaboration targeting LBW and a ‘no collaboration’ comparison condition.

Just as studies of individuals report demographic characteristics of participants, we present the following characteristics of our sample of collaboratives. Based on 2003 self-assessment data, the most recent year included in the operational definition of the intervention, the average age of collaboratives was 6.8 years (SD = 2.6, Min = 2, Max = 10). The average tenure of the collaborative coordinator, the person responsible for managing day-to-day operations of the collaborative, was 3.9 years (SD = 2.7, Min = 0.1, Max = 10). The average tenure of the chair of the governing board was 1.8 years (SD = 1.4, Min = 0, Max = 6). In terms of organizational formalization, 19 of the 25 collaboratives had membership requirements (76 %), 22 had by-laws (88 %), 21 had a committee structure on the governing board (84 %), and 24 had written member roles and responsibilities (96 %). Regarding the composition of membership, 15 of 25 collaboratives included consumers of services (60 %), 12 included youth (48 %), and 22 each (88 %) included representatives of business and of local government.

Comparison Counties

In order to test the effectiveness of collaboratives in reducing LBW rates, the 25 FC counties were statistically matched with comparison counties from Arkansas (n = 75), Mississippi (n = 82), Tennessee (n = 95), and Virginia (n = 135) using propensity score methods. These states were chosen because of their geographic proximity to Georgia and the availability of county-level LBW data for all 8 years from 1997 through 2004 that shared a common definition of LBW (number of babies born weighing less than 5 lbs, 8 oz, divided by total number of live births).

Measures

Matching Variables

The propensity score model variables listed in Table 1 were drawn from census estimates (US Census 1990, 2000). All variables represent 2000 census variables with the exception of the variables reflecting changes between 1990 and 2000 values (i.e., changes in total population, the percentage of children under 18, the percentage of Black children under 18, and the percentage of Hispanic children under 18), which were created to account for population shifts over time. The composite socioeconomic status (SES) variable was drawn from a compilation of economic indicators from the 2000 census provided by the RAND Corporation (2009). Indicators of socioeconomic status included: (1) percent of adults older than 25 with less than a high school education, (2) percent of adult males unemployed, (3) percent of households with income below the poverty line, (4) percent of households receiving public assistance, (5) percent of households with children that are headed by a female, and (6) median household income. The composite SES variable was normalized on a scale from 0 to 100 on which higher scores indicate higher SES.

Table 1 Means for matching variables among FC and comparison counties, before and after matching

Low Birthweight

Low birthweight data from 387 comparison counties and 25 FC counties were obtained from the Kids Count Data Center (Annie E. Casey Foundation 2009a) for the years 1997 through 2004. In each year the LBW rate was defined as the total number of babies born weighing less than 2,500 g (or 5 lbs, 8 oz), divided by the number of live births in the county, multiplied by 100. The use of archival community indicators as outcomes has been recommended previously as a promising strategy for outcome evaluations of community-level interventions (Gabriel 1997).
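As a minimal illustration of the rate definition above, the following Python sketch computes an LBW rate for one county-year. The function name and the example counts are hypothetical and are not drawn from the Kids Count data.

```python
def lbw_rate(low_weight_births, live_births):
    """Return low weight births per 100 live births for one county-year."""
    if live_births <= 0:
        raise ValueError("live_births must be positive")
    if not 0 <= low_weight_births <= live_births:
        raise ValueError("low_weight_births must lie between 0 and live_births")
    return 100.0 * low_weight_births / live_births


# Hypothetical county-year: 42 births under 2,500 g out of 500 live births.
example_rate = lbw_rate(42, 500)
```

Because the rate is a percentage of live births, small counties with few births can show large year-to-year swings from only a handful of cases.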

Analysis Procedure

Propensity scores were created by including 11 observed county characteristics in a logistic regression equation (with comparison or FC county status as the dependent variable) to assign a single summary score (the propensity score) to each of the FC and comparison counties. The resulting propensity score represents the estimated probability of being an FC county given a county's observed characteristics, and it is computed for every county regardless of its actual status (values range from 0 to 1, with values near 1 corresponding to counties whose characteristics closely resemble those of FC counties). The 11 covariates used in this analysis were chosen because they were expected to be associated with the unique social, economic and demographic characteristics of FC counties, and to predict differences in LBW between FC and comparison counties (Hillemeier et al. 2007; Roberts 1997). The complete list of covariates accounted for in the propensity score equation is shown in Table 1.
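The general idea of a propensity score can be sketched in a few lines of Python. This is an illustration only, not the estimation routine used in the study: it fits a logistic regression by plain gradient ascent, and the two standardized covariates and the FC status values are toy data invented for the example.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, lr=0.1, n_iter=5000):
    """Fit logistic regression by batch gradient ascent on the log-likelihood.

    X is a list of covariate rows (assumed standardized), y a list of 0/1
    group indicators. Returns [intercept, coef_1, ..., coef_p].
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for row, target in zip(X, y):
            pred = sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], row)))
            err = target - pred
            grad[0] += err
            for j, x in enumerate(row):
                grad[j + 1] += err * x
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def propensity_scores(X, beta):
    """Estimated probability of intervention status for every row of X."""
    return [sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], row)))
            for row in X]


# Toy data (hypothetical): two standardized covariates for eight counties.
X_TOY = [[-1.2, 0.3], [-0.8, -0.5], [-0.3, 0.9], [0.1, -1.0],
         [0.6, 0.4], [0.9, -0.4], [1.1, 1.0], [-0.2, -0.5]]
FC_STATUS = [0, 0, 1, 0, 0, 1, 1, 0]  # 1 = FC county, 0 = comparison county

scores = propensity_scores(X_TOY, fit_logistic(X_TOY, FC_STATUS))
```

Each county receives one score between 0 and 1, so matching on 11 covariates reduces to matching on a single number.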

The propensity scores were then used to match FC counties to comparison counties using a full propensity score matching procedure. Based on propensity scores, full matching creates a series of matched sets that include at least one intervention county (FC) and at least one comparison county (Stuart 2010; Stuart and Green 2008). The cases in each set are then statistically weighted. Within each set, intervention (FC) cases are assigned a weight of 1 and comparison cases are weighted in proportion to the ratio of intervention to comparison cases in the set. Full matching, compared to classical nearest neighbor or other forms of 1:1 matching, utilizes all available comparison cases and provides an optimal balance of propensity scores between intervention and comparison cases (Rosenbaum 2002; Stuart and Green 2008; Stuart 2008, 2010; Hansen 2004). For more information on full matching procedures, see Stuart and Green (2008) and Stuart (2010). Matching was conducted using the MatchIt statistical package (Ho et al. 2007) within the R statistical software package v2.11 (R Development Core Team 2010). After matching was completed, outcome analysis consisted of latent growth modeling to determine whether change in LBW rates from 1997 to 2004 differed among FC counties compared to non-FC counties. Outcome analyses were conducted with Mplus statistical software v6.1 (Muthen and Muthen 1998–2010) using latent growth modeling techniques, with 1997 representing the intercept (coded as 0), and yearly LBW rates representing growth over time until 2004 (coded as 7).
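The weighting rule described above can be sketched as follows, assuming the matched sets have already been formed. The sketch takes the comparison weight to be the number of FC cases divided by the number of comparison cases in the set. Matching software may additionally rescale these weights, which is not shown here, and the set labels are hypothetical.

```python
from collections import defaultdict


def full_matching_weights(set_ids, treated):
    """Return one weight per case given matched-set labels and 0/1 status.

    Intervention cases get weight 1. Comparison cases get
    (intervention cases in set) / (comparison cases in set).
    """
    counts = defaultdict(lambda: [0, 0])  # set label -> [n_comparison, n_intervention]
    for set_id, is_treated in zip(set_ids, treated):
        counts[set_id][1 if is_treated else 0] += 1

    for set_id, (n_comp, n_trt) in counts.items():
        if n_comp == 0 or n_trt == 0:
            raise ValueError(f"matched set {set_id!r} lacks one of the two groups")

    weights = []
    for set_id, is_treated in zip(set_ids, treated):
        n_comp, n_trt = counts[set_id]
        weights.append(1.0 if is_treated else n_trt / n_comp)
    return weights


# Hypothetical matched sets: set "A" has 1 FC and 2 comparison counties,
# set "B" has 2 FC counties and 1 comparison county.
example_weights = full_matching_weights(
    ["A", "A", "A", "B", "B", "B"],
    [1, 0, 0, 1, 1, 0],
)
```

In this example the two comparison counties in set "A" each receive a weight of 0.5, and the single comparison county in set "B" receives a weight of 2, so each set's comparison cases carry the same total weight as its FC cases.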

Results

Matching

Table 1 presents descriptive statistics for the matching variables before and after the matching procedure. The similarity between FC and comparison counties was markedly improved after the matching procedure. Specifically, after matching, the standardized bias for six of the eleven covariates improved by more than 90 %, and all fell well below the 0.25 threshold, indicating well matched samples (Harder et al. 2010; Ho et al. 2007). The smallest improvement was observed for the SES composite variable but this mild improvement (2 %) was due to the fact that this variable was already well balanced prior to matching.
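The balance diagnostics above can be illustrated with a short sketch. It assumes a common definition of standardized bias, the difference in group means divided by the standard deviation in the intervention group, with the comparison mean weighted after matching. The text does not state the formula, so this definition, the function names, and the toy values are all assumptions.

```python
import statistics


def weighted_mean(values, weights):
    """Weighted arithmetic mean."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def standardized_bias(treated_vals, comparison_vals, comparison_weights=None):
    """Mean difference scaled by the intervention-group standard deviation."""
    if comparison_weights is None:
        comparison_weights = [1.0] * len(comparison_vals)
    diff = (statistics.mean(treated_vals)
            - weighted_mean(comparison_vals, comparison_weights))
    return diff / statistics.stdev(treated_vals)


def percent_improvement(before, after):
    """Percent reduction in absolute standardized bias after matching."""
    return 100.0 * (abs(before) - abs(after)) / abs(before)


# Toy covariate values (hypothetical) for 3 FC and 4 comparison counties.
fc_values = [10.0, 12.0, 14.0]
comparison_values = [6.0, 8.0, 12.0, 14.0]
matching_weights = [0.5, 0.5, 1.0, 1.0]

bias_before = standardized_bias(fc_values, comparison_values)
bias_after = standardized_bias(fc_values, comparison_values, matching_weights)
improvement = percent_improvement(bias_before, bias_after)
```

A covariate would be judged adequately balanced under the threshold cited above when the absolute value of the post-matching bias falls below 0.25.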

Latent Growth Model

The primary outcome of interest was change in low weight births over the period 1997 through 2004. A latent growth model was fit to determine whether FC counties had different linear rates of change in LBW rates over the 8-year study period than the comparison counties, controlling for the numerous community contextual characteristics in the propensity score model. A full information, robust maximum likelihood estimator was employed to obtain parameter estimates and standard errors that are robust to non-normality and missing data under the assumption that data were missing at random conditional on the covariates. Less than three percent of the cases (n = 11 comparison counties) were removed from the analyses due to missing data on all years, and 91 % of the remaining counties had no missing data (leaving 376 total comparison counties; 25 total FC counties). This model incorporated the weights from the full matching procedure and adjusted standard errors to account for the clustering of matched sets as well. The unconditional model for LBW rates from 1997 through 2004 specified the intercept at 1997 and included a linear factor to describe change over time. This model fit the data well [χ2(31) = 33.46, p = 0.349, CFI = 0.99, RMSEA = 0.01] and described a slight increase in LBW rates over time (b = 0.14, p < 0.001, CI [0.10, 0.18]). The intercept (variance = 3.67, SE = 0.62, p < 0.001) and slope (variance = 0.01, SE = 0.01, p = 0.423) were positively correlated (r = 0.71). To further examine the amount of variance in the slope factor, we compared this model to one in which the slope variance was fixed to zero, and found a significant reduction in model fit [χ2(2) = 9.54, p = 0.008], indicating the variation in the slope was estimable. The inclusion of a quadratic factor did not significantly improve model fit [χ2(27) = 31.82, p = 0.24, CFI = 0.99, RMSEA = 0.02, Δχ2(4) = 1.28, p = 0.865]. Accordingly, effects of FC were tested in the linear model to determine whether change in LBW rates over time in FC counties differed from comparison counties.
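The time coding of the linear growth model can be made concrete with a small sketch. The slope of 0.14 is the unconditional estimate reported above. The intercept of 8.0 is a hypothetical placeholder, since the mean intercept is not reported in the text.

```python
# Years 1997..2004 are coded 0..7, so the intercept refers to 1997.
YEARS = list(range(1997, 2005))
TIME_CODES = [year - 1997 for year in YEARS]


def linear_trajectory(intercept, slope, time_codes=TIME_CODES):
    """Model-implied mean LBW rate at each time code under linear growth."""
    return [intercept + slope * t for t in time_codes]


# Reported unconditional slope (0.14) with a hypothetical intercept (8.0).
trajectory = linear_trajectory(intercept=8.0, slope=0.14)
```

With this coding the slope is the expected change in the LBW rate per year, so a slope of 0.14 implies an increase of about one percentage point (0.14 × 7 = 0.98) between 1997 and 2004.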

The conditional model included two predictors: the dummy coded intervention variable (0/1) and the SES composite score to address the possibility that this variable was not sufficiently accounted for by the propensity score. The slope and intercept parameters were regressed on both predictors. This model also fit well [χ2(43) = 59.85, p = 0.045, CFI = 0.98, RMSEA = 0.03]. Findings from the conditional model indicated non-significant differences between FC and comparison counties in initial LBW status in 1997 (b = 0.63, SE = 0.37, p = 0.086, 95 % CI [−0.088, 1.353]), but significant differences in the rate of change in LBW rates over time (b = −0.13, SE = 0.06, p = 0.041, 95 % CI [−0.25, −0.01]). The rate of increase in LBW rates over time was significantly smaller among FC counties than among comparison counties. The standardized effect size for the difference in mean slopes (δ) was 1.48 (Raudenbush and Liu 2001), indicating that the rate of change in LBW in FC counties was approximately one and a half standard deviation units lower than in comparison counties. Applying the estimated growth rate for comparison counties to FC counties as a reflection of what would have happened without FC collaboration, these findings suggest that collaboration prevented approximately 50 low weight births across the 25 FC counties over the 8-year study period. Additionally, higher SES was associated with a lower intercept (b = −0.27, SE = 0.02, p < 0.001, 95 % CI [−0.31, −0.23]) and a smaller slope (b = −0.01, SE = 0.01, p = 0.004, 95 % CI [−0.02, −0.001]). The difference between FC and comparison counties is illustrated in Fig. 1, which displays the estimated LBW trajectories for both groups.
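The interval and effect size reported above can be approximated from the rounded estimates with a short sketch. It assumes a normal-theory interval (estimate ± 1.96 × SE) and assumes that δ is the slope difference divided by the square root of the slope variance. The text gives neither formula, and both function names are hypothetical.

```python
import math


def wald_ci(estimate, se, z=1.96):
    """Normal-theory 95 % confidence interval: estimate +/- z * se."""
    return (estimate - z * se, estimate + z * se)


def slope_effect_size(slope_difference, slope_variance):
    """Slope difference in standard deviation units of the slope factor."""
    return slope_difference / math.sqrt(slope_variance)


# Rounded values from the text: b = -0.13, SE = 0.06, slope variance = 0.01.
ci_low, ci_high = wald_ci(-0.13, 0.06)
delta_approx = slope_effect_size(-0.13, 0.01)
```

With these rounded inputs the interval is roughly −0.25 to −0.01, in line with the reported interval. The effect size is about 1.3 in magnitude rather than the reported 1.48, a gap that may reflect rounding of the published variance or a different variance term.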

Fig. 1 Model estimated rates of LBW among FC and comparison counties from 1997 to 2004

Discussion

Although community collaboration has become a popular strategy for addressing a range of different outcomes, reviews have found limited evidence for its effectiveness in improving population-level health and well-being (Hallfors et al. 2002; Klerman et al. 2005; Kreuter et al. 2000; Roussos and Fawcett 2000). In this study we have demonstrated a research approach that allows for examination of outcomes over a relatively long study period using a relatively large sample of communities, and we have produced evidence of a relationship between community collaboration and beneficial change in county rates of low birthweight. Linear change in county rates of low infant birthweight from 1997 to 2004 was modeled among 25 Georgia counties with Family Connection collaboratives targeting low infant birthweight, and a comparison group of 376 counties in other southeastern states statistically weighted for valid comparison. Although rates of low infant birthweight increased on average over the study period for both FC and comparison counties, the rate of increase for FC counties was significantly lower than in comparison counties. This difference held after propensity score matching and regression-based control for a variety of community contextual factors likely associated with rates of low infant birthweight and intervention group membership.

According to the Family Connection theory of change, this relationship between collaboration and county-level rates of LBW would result from changes at the individual level, via services and changes in the way services are delivered, and at the community level via changes in community awareness, policy, or funding, for example. These are the intermediate outcomes theorized to result from the FC model of collaboration which entails engaging a membership, governance, finance, planning of a strategy based on the community assessment, and evaluation to monitor progress. Due to the absence of data reflecting the nature of collaboration and services delivered in intervention and comparison counties, this study is purely theoretical in terms of the process by which an effect of collaboration would operate. We assume that comparison counties had a variety of conventional services targeting LBW, as did FC counties, so that the primary difference between groups is the value added by collaboration.

The primary threat to the internal validity of these findings is the unknown set of variables not accounted for in our propensity score model that are associated with group membership and LBW rates. In other words, to the extent that we have not accounted for all of the factors associated with selection into the intervention group as opposed to comparison, the estimate of the effect of collaboration is biased in an unknown direction. Unfortunately, sensitivity analyses designed to detect this potential hidden bias (Becker and Caliendo 2007; Rosenbaum 2002) are currently unavailable using full matching and the MatchIt software program, an issue further complicated in the current investigation because our outcome is a latent slope as opposed to a point estimate such as a group mean. One of the more salient possibilities is that because all of the FC counties were located in Georgia, an unknown state-level variable not accounted for by our propensity score model could explain the difference observed between FC and comparison counties. For example, if Georgia initiated a new state policy affecting LBW during the study period, the observed difference between FC and comparison counties could be due to that policy. However, we are aware of no such policy or other state initiative in Georgia, aside from Family Connection, that targeted LBW during this time period.

Another possible explanation for the observed difference in slopes between intervention and comparison counties is a difference in initial LBW levels (or correlation between change and initial status). If the amount of change in LBW is related to the initial level of LBW, and initial levels of LBW differ between intervention and comparison groups, these initial group differences could confound results. The estimated trajectories in Fig. 1 suggest that FC counties had higher initial LBW levels, but the model intercept, which represents the initial level, was not significantly related to the intervention group variable. In addition, the correlation between the intercept and slope was positive, suggesting that FC counties would tend to have larger increases in LBW were the difference in intercepts significant. Thus it is unlikely that the observed difference in slopes between FC and comparison counties can be explained by differences in initial LBW rates.

One of the greatest challenges of research at the community level is the causal complexity that must be accounted for in linking collaboration to community-level outcomes (Berkowitz 2001). Doing so requires clarity in both conceptual and operational definitions of collaboration and intermediate outcomes that precede improvements in health and well-being. There is considerable inconsistency in conceptual definitions of collaboration as an overarching concept (Wood and Gray 1991; Thomson et al. 2007), not to mention conceptual and operational definitions of elements of collaboration such as functioning and intermediate outputs (Granner and Sharpe 2004). In this study we have attempted to clearly characterize both the collaboratives and their theory of change. Interpretation of this analysis would be aided by more complete observation of services provided to address LBW in both intervention and comparison counties as well as the extent and quality of collaboration occurring in both groups. Such data would illuminate the differences between the groups in terms of intervention, as well as provide possibilities for understanding mediating pathways between collaboration and observed outcomes. It should be noted that, to the extent that interorganizational collaboration did occur in comparison counties, we would expect less pronounced differences in outcomes. Although we did not have data to illuminate the pathways by which collaboration might exert its effect on LBW, there is a growing body of literature examining intermediate outcomes of collaboration (Brown et al. 2010, 2011; Emshoff et al. 2007; Hallfors et al. 2002; Nowell and Foster-Fishman 2011; Zakocs and Edwards 2006). Ultimately, future studies should account for variation in the extent or quality of collaboration, intermediate outputs of collaboration such as plan quality and implementation quality, and relevant health and well-being outcomes. There is a great deal of work to be done in identification of the essential aspects of collaborative functioning and intermediate outputs across the various types of collaboration being studied.

Although measurement of the nature of collaboration and services would have been a difficult task in this relatively large sample of communities, the fact that we were able to examine outcomes in such a large sample is one of the major strengths of the use of archival community outcome data. This approach also allowed us to examine outcomes over a longer period of time than would typically be afforded by original data collection. The availability of the same data for comparable communities allowed for the identification of a matched comparison group.

It should be noted that because this study was conducted entirely at the community level the ecological fallacy is potentially applicable (Greenland and Robins 1994). In the case of a variable like SES, there are likely unique effects to be estimated for both the individual- and community-level variance components of SES. Because we did not model the effect of individual-level SES, our estimate of the effect of county-level SES is not to be taken as an estimate of the effect of SES on LBW at the individual level. The ecological fallacy is germane to the present study in that the cause of the observed relationship between collaboration and LBW at the county level may operate at the individual level. For instance, if the observed effects were solely due to individuals receiving services, then the observed relationship between collaboration and community-level outcomes would be spurious given that the cause existed at the individual level. This relates back to the theoretical causal chain linking collaboration to outcomes. This study is agnostic regarding the level at which a possible causal effect of collaboration is active, and we assume that effects of collaboration occur via both individual-level and community-level pathways. Thus the present findings must be interpreted as potentially resulting from both individual and community-level effects.

These limitations notwithstanding, this study provides a test of effects of interorganizational collaboration on an indicator of community well-being. Low birthweight is a strong predictor of infant mortality as well as a number of other negative health and developmental outcomes throughout the lifespan. We have produced evidence of a relationship between collaboration and desirable change in LBW. We have done so within the practical constraints of an evaluation with a limited budget. We have demonstrated how propensity score matching techniques and archival community indicators can be used to study an existing network of mature collaboratives over a long period of time in a relatively large sample of communities. We anticipate further extension of research on community collaboration to provide a deeper understanding of the mechanisms by which collaboration leads to changes in community health and well-being.