Introduction

An environmental burden is any environmental factor that causes adverse effects [1]. People of lower socioeconomic status (SES) tend to bear a heavier share of these environmental burdens [2,3,4,5]. One example of an environmental burden is air pollution exposure, which is currently the top environmental risk factor leading to premature death globally [6, 7]. Nitrogen oxides (NO + NO2 = NOx) and fine particulate matter (PM2.5) in particular have been linked to adverse health effects including prenatal outcomes [8, 9], perinatal outcomes [10], cardiovascular diseases [6, 11,12,13], and respiratory ailments [6, 13, 14]. With over 66% of the world’s population projected to live in cities by 2050 [15], the health effects of air pollution are of concern as urban areas generally have higher levels of pollutants than rural zones [16].

Inequities in air pollution exposure [2,3,4,5, 17] and relevant health outcomes [18,19,20,21] have been identified previously, though there is less certainty in the research seeking their etiologies [5]. Health equity studies have proven challenging due to confounding socioeconomic factors, temporal lags, non-linear relationships between exposures and outcomes, and characteristics of the physical environment that may modify health outcomes related to exposures [22]. Associating a set of exposures with numerous health outcomes is a complex problem from an exposure science perspective [23]. Some disparities can be attributed to inequitable access to care or health behaviors associated with diet and smoking [19], but others may relate to the physical environment. For instance, land-use patterns of road density, proximity to greenspace (defined as public space specifically designated as a park), and tree canopy cover can alter air quality through physical or atmospheric chemical transformation processes [24,25,26,27,28,29].

In effect, health outcomes resulting from air pollution exposure could be affected by variables not easily delineated, as has been found in studies looking at the impact of ethnicity, age, and income brackets [30,31,32,33]. Further, when several variables characterizing inequities are highly correlated, isolating conditional and unconditional effects is particularly challenging.

Despite these challenges, air quality in cities is of particular concern as many source apportionment studies have shown that traffic sources are a significant contributor to pollution [16, 34,35,36,37,38,39,40,41,42]. People living in close proximity to roadways are at risk of high exposures to air pollutants [43], and many urban populations of low SES groups typically live close to roadways [17, 44, 45]. These findings, combined with studies attributing cardiovascular and respiratory diseases to air pollution exposure, suggest that members of low SES groups in cities are likely to have poorer health outcomes in relation to their greater exposure to ambient air pollution [6, 11,12,13,14].

This study aims to characterize how health outcome disparities compare to inequities in air pollution exposure, the built environment, and demographics. Two air pollutants, PM2.5 and NOx, were used as determinants of risk of four cardiovascular and respiratory diseases: chronic asthma, chronic obstructive pulmonary disease (COPD), coronary heart disease (CHD), and stroke. NOx is used in this study as it is the reactant of concern for Atlanta in the formation of ozone [46]. Health outcome risk was measured by prevalence from the CDC’s 500 Cities Health database [47]. This study also addresses factors of the built environment such as roadways and greenspaces, which are typically source and sink areas for air pollution. An advantage of this study is its use of multiple publicly available datasets at a spatial resolution finer than many other public data sets. While the initial application is to the Atlanta, GA, USA, area, the methods used can be applied elsewhere.

Atlanta, GA, is the ninth largest metropolis in the USA. Atlanta is home to a diverse population of almost six million and hosts the busiest airport in the world [48, 49]. The region also serves as the major transportation hub for all major interstate traffic that passes through the state of Georgia.

Methods and Materials

This study makes use of multiple data sources to gain information on the statistical associations among health outcomes, pollution exposure, and infrastructure characteristics. The study population was constrained to four cities in the Atlanta metropolitan area with available data for all data sources. The cities studied (Fig. 1) are Atlanta, John’s Creek, Roswell, and Sandy Springs; they were chosen because they belong to the 500 Cities data set recently released by the US CDC. These four municipalities vary in population density and demographic profile.

Fig. 1 Municipalities and census tracts included in the study region

Data Sources

This study uses cross-sectional data sources that span 2006 to 2016. Data for health outcomes, infrastructure, and demographics were available at the census tract level; air pollution data were available on fine spatial grids (250 m resolution) and were then aggregated to census tracts.

Health data were available for 2014 [32], air pollution data were available for 2011 [50], and demographic data were obtained from the 2010 US Census. Tree canopy, road density, and park access data were provided by the Center for Geographic Information Systems (CGIS) and span from 2006 to 2016 [51].

Health Outcomes

Four health outcomes from the 500 Cities data [47] known from previous literature to be associated with poor air quality were selected for analysis. The health outcomes examined in this study are chronic asthma, chronic obstructive pulmonary disease (COPD), coronary heart disease (CHD), and stroke. Prevalence estimates for each outcome show similar spatial patterns, with high prevalence in southern Atlanta (Fig. 2).

Fig. 2 Prevalence of each of the health outcomes by census tract. All outcomes show similar spatial patterns

The prevalence data provide crude estimates in each census tract, representing the proportion of adult respondents from the CDC’s Behavioral Risk Factor Surveillance System who reported receiving a diagnosis for each health outcome. These data only include prevalence among the adult population at a single time point, without information regarding the time of diagnosis or age at diagnosis. Prevalence of COPD also includes emphysema and chronic bronchitis. Prevalence of CHD also includes angina [32].

Air Quality

Owing to the complexity, cost, and general maintenance of air monitors, the number of air monitoring stations located in each state is low [52,53,54]. For instance, the EPA’s Chemical Speciation Network (CSN) lists seven monitors within the state of Georgia [55]. In areas lacking adequate monitoring stations, observational data may not accurately capture the spatial gradients in air pollution that affect exposure. Therefore, studies that rely on limited monitoring systems to characterize air pollution exposure are subject to inaccuracies [56]. To reduce such errors, this study uses data from two computational air quality models using advanced data assimilation approaches. Those results have been thoroughly evaluated against observations.

Results from the line dispersion model R-Line [57] and the photochemical model CMAQ [58] were combined using the R-Line/CMAQ fusion method to produce air quality estimates at a fine spatial resolution incorporating comprehensive emissions and chemistry [50]. R-Line, otherwise known as the Research Line Source dispersion model, assesses dispersion of primary roadway emissions near roadways [57]. The Community Multi-Scale Air Quality model (CMAQ) is a chemical transport model capable of regional scale photochemical air quality modeling [58,59,60]. R-Line provides the fine spatial resolution necessary for these analyses while CMAQ provides the needed chemistry and regional emissions.

Air quality data for carbon monoxide (CO), PM2.5, and NOx were originally developed for 250 m by 250 m spatial grids and were thoroughly evaluated using data withholding [50]. These data were aggregated to census tracts for this inequity study (Fig. 3).
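The aggregation step can be illustrated with a minimal sketch. The aggregation rule is not specified here, so the sketch assumes a simple unweighted mean over the grid cells assigned to each tract; the function name, the cell-to-tract assignment, and the example values are all hypothetical.

```python
from collections import defaultdict


def aggregate_to_tracts(cells):
    """Average gridded concentrations within each census tract.

    `cells` is an iterable of (tract_id, concentration) pairs, one per
    250 m grid cell. The tract_id is assumed to have been assigned
    beforehand (for example, by the tract containing the cell centroid);
    that assignment rule is an assumption, not taken from the study.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for tract_id, concentration in cells:
        totals[tract_id] += concentration
        counts[tract_id] += 1
    # Unweighted mean of all cells assigned to each tract.
    return {tract: totals[tract] / counts[tract] for tract in totals}


# Made-up example values, not data from the study.
example_cells = [("A", 10.0), ("A", 12.0), ("B", 9.0)]
tract_means = aggregate_to_tracts(example_cells)
```

An area-weighted or population-weighted mean would be a reasonable alternative where grid cells straddle tract boundaries.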

Fig. 3 Modeled levels of PM2.5 and NOx throughout metro Atlanta

Demographics

Data for age and racial compositions of census tracts were obtained from the US Census Bureau (www.census.gov). Census tracts are small, localized areas, typically less than 1 km square in area, with populations generally ranging between 1000 and 5000 inhabitants. The data contain populations of racial groups, converted into percentages. An indicator variable was created for tracts that have a predominantly African-American population, defined as having more than 50% African-American residents. Age demographics were included as the percent of residents over the age of 65, as elderly residents are vulnerable to both cardiovascular and respiratory diseases. Youth populations are also known to be vulnerable, but could not be considered in this study because the CDC data reported the prevalence of each health outcome in adult populations only.
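As an illustration of the demographic variables described above, the following sketch converts tract-level counts into percentages and builds the indicator using the stated threshold of more than 50%. The function name, argument names, and dictionary keys are hypothetical and are not taken from the study's code.

```python
def tract_demographics(total_pop, african_american_pop, over_65_pop):
    """Return percentage variables and the majority indicator for one tract.

    All names here are illustrative. The indicator is 1 only when
    strictly more than 50% of residents are African-American, matching
    the definition given in the text.
    """
    if total_pop <= 0:
        raise ValueError("total_pop must be positive")
    pct_african_american = 100.0 * african_american_pop / total_pop
    pct_over_65 = 100.0 * over_65_pop / total_pop
    return {
        "pct_african_american": pct_african_american,
        "pct_over_65": pct_over_65,
        "predominantly_african_american": int(pct_african_american > 50.0),
    }


# Made-up counts for two hypothetical tracts.
majority_tract = tract_demographics(4000, 2400, 400)
boundary_tract = tract_demographics(2000, 1000, 100)
```

A tract at exactly 50% is coded 0 under this definition, which is why the comparison is strict.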

Infrastructure Metric for Roadways

Data for roadways were provided by the Georgia Institute of Technology Center for Geographic Information Systems (CGIS). In this study, the connectivity of roadways serves as a proxy for traffic, as well as pavement surface area. Roadway connectivity is defined as the ratio of links (road segments) to nodes (intersections) [51]. These data span from 2006 to 2016.
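The link-to-node ratio can be sketched as follows. This is an illustrative calculation only: the representation of a road network as a list of node pairs, and the choice to count every segment endpoint as a node, are assumptions made for the example rather than details of the CGIS data.

```python
def link_node_ratio(links):
    """Ratio of road segments (links) to intersections (nodes).

    `links` is an iterable of (node_a, node_b) pairs, one per road
    segment. Every distinct endpoint is counted as a node, which is a
    simplifying assumption of this sketch.
    """
    links = list(links)
    nodes = {node for link in links for node in link}
    if not nodes:
        raise ValueError("at least one link is required")
    return len(links) / len(nodes)


# Hypothetical block of four intersections joined in a square.
square = [(1, 2), (2, 3), (3, 4), (4, 1)]
# The same block with one diagonal street added.
square_with_diagonal = square + [(1, 3)]

ratio_square = link_node_ratio(square)
ratio_diagonal = link_node_ratio(square_with_diagonal)
```

Adding the diagonal raises the ratio from 1.0 to 1.25, reflecting a more connected network with the same number of intersections.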

Green Infrastructure Metrics

The relationship between roadways, air quality, and greenspaces has been addressed [61, 62], but their overall effectiveness in improving health outcomes is less certain [24, 63]. This relationship is further complicated by known access inequities among different communities [64, 65]. This study uses tree canopy cover and access to greenspace as measures of green infrastructure. Tree canopy cover is measured as a percentage of the census tract that is covered by canopy on both public and private land. Greenspace access provides a metric of land access specifically designated as public parks as of 2006 (Fig. 4), as well as the availability of privately held greenspace and canopy cover along roadways. These data were provided by CGIS.

Fig. 4 Infrastructure measures by census tract in the study region

Data Analysis

Analyses were run in R version 3.3.2 [66]. Continuous data were converted to z-scores to improve interpretability across variables. Analyses were conducted at the census tract level due to the aggregation of the health outcome data. The first analyses measured crude associations relating the pollutants and racial demographics separately to each health outcome using linear models with the individual pollutants and demographic indicator as the only predictor of each model. The second considered potential confounding by demographic and infrastructure characteristics, incorporating spatial autocorrelation in health outcomes among tracts. Spatial autocorrelation in this context represents expected similarities in health outcomes among nearby census tracts. To determine existence of spatial autocorrelation, Moran’s I [67] was calculated for each health outcome. Conditionally autoregressive (CAR) models [68] were run using the “spdep” package [69]. The CAR model and Moran’s I use inverse distance weighting, with distances measured between tract centroids. Explanatory variables used were air pollution, the indicator variable of having a predominantly African-American population, the percentage of elderly residents, park access, tree canopy cover, road intersection connectivity, and value of construction projects.
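The standardization and spatial autocorrelation steps can be illustrated with a short sketch. The analyses themselves were run in R with the spdep package; the plain-Python code below is only a hedged illustration of z-scoring and of the global Moran's I statistic with inverse-distance weights between centroids. It assumes the sample standard deviation (n - 1 denominator), planar coordinates, and no coincident centroids; the function names and example data are hypothetical.

```python
import math
from statistics import mean, stdev


def z_scores(values):
    """Standardize values to mean 0 and sample standard deviation 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def morans_i(values, centroids):
    """Global Moran's I with inverse-distance weights w_ij = 1 / d_ij.

    `centroids` holds one (x, y) pair per tract in planar units.
    Weights on the diagonal are zero; coincident centroids would cause
    a division by zero and are assumed not to occur.
    """
    n = len(values)
    x_bar = mean(values)
    dev = [v - x_bar for v in values]
    weight_sum = 0.0   # S0: sum of all weights
    cross_sum = 0.0    # sum of w_ij * dev_i * dev_j
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            w = 1.0 / math.dist(centroids[i], centroids[j])
            weight_sum += w
            cross_sum += w * dev[i] * dev[j]
    return (n / weight_sum) * cross_sum / sum(d * d for d in dev)


# Six hypothetical tracts along a line, with made-up prevalence values.
line = [(float(k), 0.0) for k in range(6)]
clustered = morans_i([1, 1, 1, 5, 5, 5], line)
alternating = morans_i([1, 5, 1, 5, 1, 5], line)
```

In this toy example the clustered pattern gives a positive statistic and the alternating pattern a negative one, which is the qualitative behavior used to judge clustering. Significance testing and the CAR model fit are not covered by this sketch.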

The final set of analyses used air pollution exposure rather than health effects as the response variable. These analyses also used CAR models with the remaining explanatory variables as the predictor variables. Similarly, these analyses used inverse distance weighting for spatial autocorrelation. The purpose of these analyses was to characterize census tracts with higher or lower levels of air pollution with regard to demographics and built environment.

Results

Analyses were conducted at the census tract level on four cities in the Atlanta metropolitan area. A total of 169 census tracts were included in these analyses. Of these, 117 were located in the City of Atlanta, 14 in John’s Creek, 16 in Roswell, and 22 in Sandy Springs. All 69 census tracts with a predominantly African-American population were located in Atlanta. Figure 5 shows the geographic separation between census tracts that are predominantly African-American and those that are not. Among census tracts with a predominantly African-American population, the median percentage of African-American residents was 93.3, and among tracts with a predominantly non-African-American population, the median percentage of African-American residents was 11.3. Although not part of the analysis, census tract mapping of various economic indicators showed nearly identical patterns (details available in the supplemental materials).

Fig. 5 Physical separation of census tracts with predominantly Black population and predominantly non-Black population

Summary statistics of the continuous variables used are shown in Table 1. Exposure to NOx showed greater variability than exposure to PM2.5, largely because NOx is directly emitted, whereas PM2.5 in Atlanta is dominated by secondary species formed through chemical reactions in the atmosphere [70, 71]. However, both pollutants follow the same trends concerning regions with higher and lower exposure, as automobiles emit both NOx and PM2.5. Variability among the prevalence of the four health outcomes is consistent, with the four outcomes having similar sample standard deviations.

Table 1 Descriptive statistics of data used in models

The first analyses provide crude estimates individually comparing how different pollutant exposures and racial demographics relate to cardiovascular and respiratory diseases of interest (Table 2). All models for PM2.5 showed a positive association between pollution levels and prevalence of each of the four health outcomes with significance at the 95% level. Models for NOx showed similar positive associations, but only the results for asthma and stroke were significant at the 95% level. The models with racial demographics showed that tracts with a predominantly African-American population had significantly higher prevalence of each health outcome.

Table 2 Model results for health outcomes

Values of Moran’s I for all health outcomes were significant (p < 0.0001), providing evidence of spatial autocorrelation (Table 2). Values were all positive, providing strong evidence that the data are clustered rather than spatially random. Values of Moran’s I for both pollutants were also positive and significant (p < 0.0001), implying spatial clustering (Table 3).

Table 3 Model results for pollution

After adjusting for demographics and infrastructure as combined factors in the second set of analyses, the estimated associations between air pollution exposure and disease were no longer positive, suggesting that the demographic and infrastructure variables are more significant. Tracts that have predominantly African-American populations were again significantly associated with having higher outcome prevalence for each of the four health outcomes. Other variables significant for some health outcomes included park access (listed as Greenspace) and tree cover, as census tracts with greater park access and tree cover were associated with having higher prevalence of COPD, CHD, and stroke.

The results from the third set of models found that census tracts that have predominantly African-American populations are associated with greater levels of both PM2.5 and NOx (Table 3). Census tracts with greater amounts of tree cover were associated with increased levels of PM2.5.

Discussion

Summary of Findings

This study gathered data at the census tract level and employed statistical models accounting for spatial autocorrelation to characterize inequities in cardiovascular and respiratory disease burden in metro Atlanta. This study also investigated inequities in exposure to particulate matter and NOx, two air pollutants that have been previously shown to relate to risk of cardiovascular and respiratory ailments, and the impact of infrastructure on these outcomes.

The findings of this study primarily show that racial demographics were the strongest predictor of each health outcome. This study also shows that at finer scales, health outcomes are more strongly associated with racial demographics but show greater variability in associations with infrastructure and air quality. The stratified scatter plots (Fig. 6) provide evidence that census tracts with predominantly African-American populations have, on average, greater prevalence of each health outcome as well as greater ambient levels of each pollutant. The first model results also showed that racial demographics are strongly associated with ambient levels of PM2.5 and NOx, and that exposure to those pollutants is associated with negative health outcomes.

Fig. 6 Scatter plots stratified by racial demographics comparing pollution levels with health outcome prevalence

Models that control for demographic characteristics, however, led to counterintuitive results. Decreased air pollution exposure was associated with higher prevalence of each outcome, though most associations were nonsignificant (Table 2). Lack of significance in these adjusted models is likely due to high correlations among predictor variables in the models. This suggests that the limited data set used here could not fully control for how demographics interacted with air quality and greenspace to impact health outcomes.

While some studies have reported that vegetation such as tree canopy can reduce air pollution and thereby adverse health outcomes, in this study tree canopy and greenspace were associated with higher levels of both COPD (Table 2) and air pollution (Table 3). This result could be explained partly by studies showing that the presence of such greenspaces can worsen air quality or contribute to formation of PM2.5 [27, 72,73,74]. In addition, the combination of biogenic volatile organic compounds from trees and NOx can lead to the formation of ozone [28, 75, 76], which is known to degrade respiratory performance [77,78,79].

While predominantly African-American populations have, on average, greater prevalence of all outcomes, there are other factors not considered in this study, such as the spatial distribution of economic indicators reflected in the supplemental figures, that could point to underlying causes of health inequities. Thus, while the results show that racial demographics are strongly associated with high prevalence of these health outcomes, they might be a proxy indicator for economic inequity or differences in access to health care services that might affect health outcomes [2].

The Pearson correlation coefficients (Table 4) indicate that the percentage of African-American residents in a census tract is strongly positively correlated with each of the four health outcomes as well as the two air pollutants. However, using correlated predictors in models can lead to multicollinearity, which typically results in parameter estimates that are inaccurate. This likely explains the lack of significant association and the negative associations seen between pollutants and health outcomes in the adjusted CAR models. As a sensitivity analysis, the models were rerun using only the census tracts with a majority African-American population. These models were the same as the previous CAR models, with the removal of the term for having a predominantly African-American population. For all four health outcomes, the model terms for both pollutants remained less than zero and nonsignificant at the 95% level. Similar results were observed when restricting analyses to only the city of Atlanta.
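The effect of correlated predictors can be illustrated with a short sketch. It computes the Pearson correlation between two predictors and the corresponding variance inflation factor (VIF), which for a model with exactly two predictors equals 1 / (1 - r^2). The VIF is not a diagnostic reported in this study; it is introduced here only as an illustration, and the function names and data are made up.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """Variance inflation factor when a model has exactly two predictors.

    With two predictors, the R^2 from regressing one on the other is
    r^2, so VIF = 1 / (1 - r^2). Perfectly collinear inputs (|r| = 1)
    would cause a division by zero.
    """
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Made-up standardized predictors, not data from the study.
pollutant = [1.0, 2.0, 3.0, 4.0, 5.0]
nearly_collinear = [1.1, 1.9, 3.2, 3.9, 5.1]
uncorrelated = [2.0, 1.0, 3.0, 1.0, 2.0]

r_high = pearson_r(pollutant, nearly_collinear)
vif_high = vif_two_predictors(pollutant, nearly_collinear)
vif_low = vif_two_predictors(pollutant, uncorrelated)
```

A VIF near 1 indicates that a coefficient's variance is barely inflated, while a large VIF means the estimate, including its sign, can be unstable.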

Table 4 Pearson correlation coefficients for all data used in models

Comparisons to Literature

Using racial demographics, our study yielded results similar to those in the literature, in that census tracts with predominantly lower SES populations, which tend to consist mainly of African-American and other minority populations, were more likely to experience greater prevalence of disease as well as poorer air quality.

This study utilized data published in the CDC’s 500 Cities Project, which considers health outcomes at a finer scale. Similar findings between race and health outcomes are discussed in other studies [1, 3]. In investigating other known relationships, such as those between disease prevalence and greenspace, we found weaker and inverse associations at the census tract level, contradicting expectations but consistent with a number of studies [80,81,82,83,84]. In those studies, mixed, inverse, or nonsignificant associations are often reported when relating greenspace and certain health outcomes. Previous findings of this nature include negative associations between urban greenspace and cardiovascular and respiratory mortality that result from allergen exposure [81] and a positive correlation between greenspace exposure and obesity [83]. Some of these results have been explained by other relevant factors such as greenspace with low frequency of use not leading to lowered cardiovascular disease risk [85]. Low frequency of use may relate to socioeconomic factors such as education and safety concerns [86, 87]. Contrasting studies show greenspace associated with lower risk of heart disease [88] and areas with greater tree canopy having greater allergen exposure [89], secondary particulate matter, or ozone, which is known to be associated with respiratory disease [28, 90,91,92].

There also exists evidence linking improved mental health and health perception to greenspace exposure, examining depression [93, 94], perceptions [95], and aggression among adolescents [96, 97]. It is worth noting, however, that perceptions of health may differ from actual physical health [82] or the amount of increase in childhood physical activity [98]. Studies of this nature often target specific health outcomes among specific demographics, such as birthweight [99] and adolescent health. These topics were not examined in the present study.

Studies of other effects of greenspace on health have yielded mixed results, such as the relationship between greenspace and obesity. This is likely due to confounding factors, as previous research has found mixed associations across age and gender groups [100]. Other research has shown that greenspace access is associated with higher obesity [101, 102], suggesting that other factors are relevant to obesity, including diet. Greenspace is also believed to influence health outcomes through changes in the surrounding environment, including urban heat islands [103]. Relationships between greenspace and health may also differ across locations, including variation within the continental USA [93] and across developed and developing nations [104].

Many of these studies exclude the impact of anthropogenic landscape form, which can affect health outcomes [105], particularly in urban areas, where lower walkability has been observed to be associated with higher BMI [106]. Such findings highlight differences in effects of greenspace and other urban form characteristics across different locations. Defining urban form is also of importance; the present study found the unexpected result of intersection density being inversely related to air pollution and adverse health outcomes. This is unexpected, as traffic in cities has been shown to correlate strongly with air pollution [107]. A different metric to account for traffic may be more suitable.

Adverse environmental impacts from certain urban forms (such as roadway infrastructure) can in some cases reduce the benefits of having greenspace (i.e., increased physical activity). A recent Lancet report found that exposure to air pollution removed the positive impact associated with exercise [108]. Other works [106] showed the adverse impact of urban form through a walkability index and found high walkability was associated with higher depression prevalence in deprived neighborhoods, potentially due to exposure to derelict buildings and noise. Security and safety have also been shown to be factors relating to greenspace access for women and minorities in Atlanta [86]. As Figs. 4 and 5 show, areas with high park access are also areas with larger minority populations. This could, in part, explain the findings in this study, where tracts with greater park access nonetheless had higher prevalence of adverse health outcomes.

Given the strong correlations among variables in the model, the results of this study do not reflect etiological relationships, but associations that characterize existing inequities in metro Atlanta. The findings of this study motivate future work to investigate etiologies of these inequities. Interventions designed to improve cardiovascular and respiratory health as well as to reduce pollution exposure would benefit from these findings by highlighting the regions that are in the greatest need. It is important to note the trends seen in health outcomes included in our model do not necessarily represent those of all health metrics. Only those outcomes associated with air quality were included. Due to differences in our model, our findings may differ from health studies which consider a wider array of outcomes.

Limitations

One limitation of this study is generalizability. The study region consisted of four municipalities in metro Atlanta. The results from the collection of these four cities may not be applicable broadly to the USA, but rather only to other metropolitan regions with similar characteristics to Atlanta. On the other hand, the availability of the 500 Cities data and the techniques applied here suggest similar analyses could be applied widely to develop a more thorough understanding of demographic, air quality, and greenspace relationships to health.

Another limitation lies within the correlations among predictors. The percentage of African-American residents, ambient levels of both pollutants, and prevalence of each of the four health outcomes showed similar patterns among the census tracts within the study region. Furthermore, the census tracts with higher proportions of African-American residents, higher pollution, and greater adverse health outcome prevalence were located in southern parts of Atlanta. In addition, this investigatory study was limited to using one metric of low SES groups, which was demographics, and did not include other economic factors such as income or education level.

Lastly, the collection of the health outcome data is limiting. Data represented aggregate prevalence of each health outcome within each census tract, and collection relied on self-reporting of previous diagnoses without including information regarding the age of the respondents when diagnosed or the year of diagnosis. This makes analyses of costs associated with these health outcomes or of effects of time-specific exposures impossible with the present data sources. Data regarding timing of pollution exposures and diagnoses would provide stronger conclusions, but are not publicly available. Another benefit to having individual-level health data rather than aggregated prevalence is that individual air pollution exposure can be evaluated. Existing data pertaining to traffic exposure are not as beneficial to this study as to other studies because they would need to be aggregated to the census tract level, as the pollution data were for the present analyses. Respondents were asked if they had been diagnosed by a medical professional. For stroke in particular, this introduces selection bias since the stroke cases consisted of individuals who had survived a previous stroke.

Future Directions

Further work on this topic aims to address the current study’s limitations and provide more insight into the etiologies of the inequities in both air quality and health outcomes. Access to longitudinal data and probabilistic non-linear data-based models [109] would make etiological conclusions more feasible than this cross-sectional study. Having individual-level data instead of these publicly available ecological data would improve generalizability and provide stronger analyses. Additionally, incorporating available data on social determinants of health, such as unemployment, income, or educational attainment, may help explain the inequities in health observed in the data. This study focused on the use of built-environment characteristics to explain inequities in health outcomes in this region, and the results show that there are other factors that are likely to be stronger in explaining these inequities.

The design of this study highlights how publicly accessible data can be used to help municipalities understand the health state of their residents. While the mediating influence of infrastructure proved to be more difficult to support, cities asking similar questions may find answers through utilizing the methods outlined in this study. Future studies may explore other aspects of the built environment which influence health such as access to healthy food sources and health clinics. Infrastructure improvement data revealing recent development of an area could also be explored as it may reveal the relationship between tree canopy coverage and health outcomes, in addition to considering atmospheric chemical processes.

As finer-scale health, air quality, demographic, and infrastructure data become available, newer models may investigate currently supported trends under a new lens. National, state, and county scaled health studies are useful in informing health policy and planning. As these trends are investigated at the census tract level, novel findings may provide informed insights into these complex relationships and thus guide policy and practice innovations.

Conclusions

This study provides evidence that areas with poor air quality carry a greater burden of cardiovascular and respiratory diseases. These areas have greater percentages of African-American residents, with many of these areas having predominantly African-American populations. Because demographics, air quality, and health outcomes are so highly connected, it is difficult to use cross-sectional data to confidently estimate causal effects.

These findings highlight challenges that face urban planners and policy makers. While there is clear evidence that regions that have predominantly African-American populations have poorer air quality and higher prevalence of the cardiovascular and respiratory diseases examined, etiologies are unclear, making targeted interventions difficult to design and implement because it is unclear which factors should be targeted for effective intervention. What is clear, however, is that these populations are burdened by both poor health and poor air quality. It is important to realize the outcome of this study reinforces the need for targeted interventions as populations in these areas are projected to grow rapidly.

Although further investigation is needed to establish causal relationships between health, pollution, and infrastructure, the geographic and racial disparities in health and air quality are clear. Factors relating to infrastructure and sources of pollution are likely to differ between regions with predominantly minority populations and regions with predominantly non-minority populations, necessitating further investigation that quantifies the spatiotemporal dependencies of city-health linkages. Because spatial patterns in demographics, pollution levels, and disease prevalence were similar, challenges exist in determining the etiologies of these associations. However, the growing availability of fine-scale data can support a consistent, national level analysis.