Introduction

The Clean Air Act requires that the Environmental Protection Agency (EPA) set standards (National Ambient Air Quality Standards; NAAQS) for pollutants considered harmful to human health and the environment. Periodically, these standards are reviewed and may be revised based on epidemiological studies. Due to this regulatory action, air pollution levels in the USA have been consistently declining over the past decades [1].

The identification of the shape of the concentration–response curve, and particularly whether there is a threshold exposure, can be a key issue in decision making for policy makers. A comprehensive characterization of the effects of air pollution at low levels is therefore warranted, as it may inform future standards and, subsequently, affect the health of millions of Americans and of people around the world.

Recent studies have highlighted that the adverse health effects of air pollution are still observable at low concentrations, even well below current standards. To inform future risk assessment and to design new and maximally efficient policies and regulations, in terms of both cost and health, it is crucial to (1) characterize the shape of the concentration-response function at low concentrations, and (2) systematically assess the findings of these studies, i.e., to evaluate the association between air pollution and adverse health at these low concentrations.

For this review, we identified published articles that examine the impacts of exposure to low concentrations of fine particles (PM2.5; particles with aerodynamic diameter ≤ 2.5 μm), a pollutant that has been consistently linked to numerous adverse health outcomes [2••, 3,4,5,6], on cardiovascular and respiratory mortality and hospitalizations. We summarize our findings and provide some insight into methodological limitations of existing analyses that might hinder interpretation.

Methods

We conducted a literature search in the National Library of Medicine’s MEDLINE/PubMed database (https://www.ncbi.nlm.nih.gov/pubmed/). The inclusion/exclusion criteria were designed to identify original population-based research articles evaluating the impacts of exposure to low concentrations of PM2.5 on cardiovascular and respiratory mortality and morbidity. We included studies published in English from 2015 to 2018.

As part of the inclusion criteria, we only included studies with low air pollution concentrations, defined as those below the current NAAQS. The primary NAAQS for PM2.5 are 12 μg/m3 for annual concentrations and 35 μg/m3 for daily concentrations. PM2.5 levels in Asia and Europe are on average higher, and in general much higher than the current US NAAQS and WHO guidelines. Because our interest lies in the effects of exposure to low levels of air pollution, we excluded these areas so that the review provides a comprehensive representation of the health effects of low exposure levels in the areas where such levels are observed or currently plausible.

We included studies examining either short-term (daily or moving averages over a few days) or long-term (annual or averages over a few years) PM2.5 exposures, with the outcome defined as a death or hospitalization due to cardiovascular disease (CVD), respiratory disease, or subcategories of these definitions. We placed no restrictions on the study design or analysis method used. We excluded studies only involving children, toxicology, climate, prenatal air pollution, Air Quality Index, pollutants other than PM2.5, those outside the USA or Canada, those with average long-term or average short-term PM2.5 concentrations above the corresponding NAAQS, and those not involving humans.

Given that most studies did not refer to the health impact of low-concentration exposure in the title or abstract, we defined the keywords for our literature search to be general and inclusive. We used the following keywords: low particulate, PM2.5, particulate, admissions, hospitalization, mortality, morbidity, NAAQS, below standard, pollution, and deaths.

Summary of Findings

Using various combinations of the keywords, the search yielded a total of 1827 articles for mortality and 248 articles for hospital admissions. After excluding articles involving children, toxicology, climate, prenatal air pollution, Air Quality Index, pollutants other than PM2.5, not involving humans, and duplicates, a total of 530 articles remained. Of these, 60 articles were conducted in the USA or Canada, with outcomes defined as CVD or respiratory mortality and hospitalizations. Full-text review of the 60 articles was conducted, and 26 articles were found to meet our inclusion criteria.

Table 1 presents a summary of the 26 articles. For each study, we extracted information on location, the study’s time frame, population description, exposure assessment and temporal resolution, health outcome, methods used for the analysis, and main findings. Among the 26 articles, 19 were conducted in the USA, and 7 in Canada. Of these, 21 examined total or cause-specific mortality, 1 examined both mortality and hospital admissions, and 4 examined cause-specific admissions.

Table 1 Summary of the 26 articles

In terms of exposure assessment, 15 studies used PM2.5 estimated from spatio-temporal prediction models [2••, 3, 7, 8, 17, 18, 21, 22••, 23,24,25, 27,28,29,30], two [13, 14] used the EPA Downscaler (DS) model [31], five used PM2.5 concentrations measured at monitoring stations and downloaded from EPA’s Air Quality System (AQS) database or other county monitors [10, 15, 16, 19, 26], two used measurements from the Harvard Supersite in Boston [12, 20], and two studies compared results obtained using five [11] and ten [9] different exposure models. In terms of exposure duration, 15 studies examined mean annual PM2.5, ten studies daily PM2.5, and one study included both daily and annual averaged exposures.

PM2.5 concentrations have been decreasing during the last 10 years; therefore, many studies were in areas where the 75th percentile of the PM2.5 distribution was below the NAAQS. We grouped the papers into three categories: (1) studies conducted in areas where air pollution levels were below the NAAQS, (2) studies restricting the study population and analysis to low levels, and (3) studies with PM2.5 concentrations that could be above the NAAQS but that estimated concentration-response functions to specifically examine a non-linear relationship between PM2.5 and the outcome, allowing for a potentially different effect estimate at lower levels. We discuss findings for each category in detail next.

Studies Conducted at Low Levels

We identified 10 studies [7,8,9,10,11,12,13,14,15,16] that were conducted in areas with low PM2.5 levels (Table 1, Fig. 1). Five of these studies investigated associations with long-term exposures and five investigated associations with short-term exposures. Figure 1 presents the percent increase in the outcome for 1 μg/m3 increase in PM2.5, for each study.

Fig. 1 List of the studies conducted at low levels. The results are presented as percent increase in the outcome for 1 μg/m3 increase in PM2.5

Among the five studies [7,8,9,10,11] that examined long-term PM2.5 exposure and total or cause-specific mortality, the annual average PM2.5 concentrations ranged between 8.3 and 12.5 μg/m3. Apart from Pun et al. [10], who used concentrations measured at monitoring stations, the long-term exposure studies used predictions from spatio-temporal models, and two of them used concentrations from different exposure models [9, 11]. The strongest association was reported by Crouse et al. [7] in a study in Canada, with an 8.8% (95% CI 6.9, 10.7) increase in mortality per 1 μg/m3 increase in PM2.5, while the increases in mortality in the other studies varied between 0.9 and 5%. Among the short-term exposure studies, effect estimates for PM2.5 were lower (e.g., 0.5% increase in mortality [12]) and largely indistinguishable from no association [13,14,15,16]. DeVries et al. [16] estimated a 7% decrease in mortality for 1 μg/m3 increase in daily PM2.5. Three of the short-term exposure studies used PM2.5 concentrations from monitoring stations [12, 15, 16], while the other two used predictions from the EPA DS model [13, 14].
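Individual studies may report effects on other scales (for example, a hazard ratio per 10 μg/m3), so expressing estimates per 1 μg/m3 requires rescaling. The review does not state how this was done. The following is a minimal sketch of one common convention, assuming a log-linear model in which the log ratio scales linearly with concentration. The function name and the example ratio are hypothetical and are not taken from any of the reviewed studies.

```python
import math


def percent_increase_per_unit(ratio: float, increment: float) -> float:
    """Convert a rate or hazard ratio reported per `increment` ug/m3 into a
    percent increase per 1 ug/m3, assuming a log-linear model."""
    if ratio <= 0 or increment <= 0:
        raise ValueError("ratio and increment must be positive")
    beta = math.log(ratio) / increment  # log ratio per 1 ug/m3
    return 100.0 * (math.exp(beta) - 1.0)


# Hypothetical example: a hazard ratio of 1.10 per 10 ug/m3.
example = percent_increase_per_unit(1.10, 10.0)
print(f"{example:.2f}% per 1 ug/m3")  # prints 0.96% per 1 ug/m3
```

Under this convention the rescaled value is slightly smaller than simply dividing the percent increase by the increment (0.96% rather than 1.00% in the example), because the rescaling is done on the log scale.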

Studies that Restrict Analyses at Low Levels

Eight studies [2••, 3, 17, 18,19,20,21, 22••] did not explicitly focus on low concentrations but conducted additional analyses restricting the data to low levels to estimate the effects below the NAAQS. Of these studies, three examined associations with long-term exposures, four examined associations with short-term exposures, and one examined associations with both long- and short-term exposures. Figure 2 presents the percent increase in the outcome for 1 μg/m3 increase in PM2.5 for each study for the full study population and when restricting below NAAQS. Three of these studies [2••, 3, 22••] also estimated a non-linear concentration-response function.

Fig. 2 List of the studies that restrict analyses at low levels. The results are presented as percent increase in the outcome for 1 μg/m3 increase in PM2.5 for the full population (Full) and when restricting to levels below NAAQS (Restricted)

All the studies found significant effects at low concentrations, with the exception of the Makar et al. study [17], which reported statistically significant effect estimates only for admissions and not for total mortality. Of the studies that examined long-term effects, all reported higher effect estimates at low concentrations compared to those in the full range of PM2.5. The majority of studies that examined short-term effects also found stronger associations at lower levels; these associations, though, were weaker (with effect sizes below 1%) than those of the long-term exposure studies.

In addition, one study [22••] in New England using the Medicare cohort applied a Poisson survival analysis to simultaneously estimate the short-term (2-day average) and long-term (365-day average) effects of PM2.5 on mortality. The authors observed that both short- and long-term PM2.5 exposures were significantly associated with all-cause mortality for the entire population, and that the estimated effects were higher when restricting the population to those with low concentrations.

Makar et al. [17] is the only study that dichotomized PM2.5 at 12 μg/m3 in the full cohort, using <12 μg/m3 as the reference category, and in a subsample restricted to concentrations <12 μg/m3, dichotomizing the exposure at 8 μg/m3. The authors reported that in this subsample, an increase of PM2.5 levels from below 8 μg/m3 to above 8 μg/m3 was significantly associated with cause-specific hospital admissions with estimated outcome increases varying between 0.7 and 2% per 1 μg/m3 increase in PM2.5 concentrations, depending on the outcome, but did not find evidence of a significant increase in mortality.

Studies that Estimate Non-linear Concentration-Response Functions

Eight articles published between 2015 and 2018 [23,24,25,26,27,28,29,30] examined potentially non-linear concentration-response functions between PM2.5 and the outcome across a range of exposures. One study examined short-term effects, while the other seven used annual PM2.5 averages from spatio-temporal prediction models. All of them found a monotonically increasing shape: four, including the study assessing short-term exposures, found a steeper slope at lower concentrations (supralinear shape), and four found a linear increase. Of note, most of the studies were conducted in areas with PM2.5 concentrations below the NAAQS.

In six out of eight studies, some type of spline (such as natural or penalized splines), with the number of degrees of freedom varying between 2 and 4, was used to estimate the concentration-response function. One paper [30] employed restricted cubic splines with and without measurement error correction and reported de-attenuated effect estimates when adjusting for measurement error.
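To make the idea of a flexible concentration-response fit concrete, the sketch below fits a linear spline with a single knot to a made-up concave (supralinear) curve and compares the slopes below and above the knot. This is a deliberately simpler stand-in for the natural, penalized, or restricted cubic splines used in the reviewed studies. The knot location, the curve log(1 + x), and all function names are assumptions chosen for illustration only.

```python
import math


def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def least_squares(rows, y):
    """Ordinary least squares via the normal equations (fine for tiny designs)."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)


KNOT = 10.0  # hypothetical knot, in ug/m3
pm = [float(v) for v in range(1, 21)]          # concentrations 1..20 ug/m3
log_risk = [math.log(1.0 + v) for v in pm]     # made-up supralinear curve

# Basis: intercept, linear term, and a hinge term that is active above the knot.
design = [[1.0, v, max(v - KNOT, 0.0)] for v in pm]
intercept, slope_low, slope_change = least_squares(design, log_risk)
slope_high = slope_low + slope_change

print(f"slope below knot: {slope_low:.3f}, slope above knot: {slope_high:.3f}")
```

A negative `slope_change` with both slopes positive corresponds to the supralinear pattern described above: the curve keeps increasing, but more steeply at lower concentrations.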

In Canada, where the 75th percentile of the PM2.5 concentrations was < 12 μg/m3 and therefore well below the NAAQS, Pinault and co-authors [25] applied the Shape Constrained Health Impact Function (SCHIF) developed by Nasari et al. [32]. The algorithm resulted in 16 to 21 shapes, and the SCHIF ensemble was then defined as a weighted average of the predictions of all models examined at any concentration, with weights defined by the likelihood function value. The authors found that, in general, the best fitting shape for most causes of death was supralinear, with the greatest effect on the outcome estimated at the lower concentrations of exposure.
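The ensemble step can be sketched as a likelihood-weighted average of candidate predictions at a given concentration. The text above only says that the weights are defined by the likelihood function value, so the normalization used here (weights proportional to the likelihood) is an assumption, and the function names, predictions, and log-likelihoods are hypothetical. This is not the SCHIF implementation itself.

```python
import math


def likelihood_weights(log_likelihoods):
    """Weights proportional to each model's likelihood, normalized to sum to 1.
    Subtracting the maximum log-likelihood avoids numerical underflow."""
    top = max(log_likelihoods)
    raw = [math.exp(ll - top) for ll in log_likelihoods]
    total = sum(raw)
    return [r / total for r in raw]


def ensemble_prediction(predictions, log_likelihoods):
    """Weighted average of the candidate models' predictions at one concentration."""
    weights = likelihood_weights(log_likelihoods)
    return sum(w * p for w, p in zip(weights, predictions))


# Hypothetical example: three candidate shapes, each giving a predicted
# log hazard ratio at the same PM2.5 concentration.
preds = [0.05, 0.08, 0.10]
logliks = [-1002.0, -1000.0, -1001.0]

weights = likelihood_weights(logliks)
combined = ensemble_prediction(preds, logliks)
print([round(w, 3) for w in weights], round(combined, 4))
```

Repeating the calculation over a grid of concentrations would trace out an ensemble curve that is dominated by the best fitting shapes.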

Methodological Challenges of Existing Studies

Study Designs

All 26 studies used different statistical models. For long-term exposures, all papers used Cox proportional hazards models (13 out of 26). All but two [17, 26] allowed for baseline hazards to vary with age and sex, four studies [7, 8, 25, 29] allowed for baseline hazards to vary with region, and two [18, 25] investigated the sensitivity of their results when stratifying on additional variables. For short-term exposures, time series (9 out of 26) and case-crossover (4 out of 26) designs were used. Four studies utilized causal inference tools, such as propensity scores (inverse probability weighting [IPW], marginal structural models, and subclassification; 3 out of 26) or instrumental variables (3 out of 26). Multi-pollutant models or models including multiple components of PM2.5 were considered in 13 out of 26 papers. Each study design and corresponding statistical approach is based on a different set of assumptions for confounding adjustment.

Study design is highly dependent on the exposure window (short- or long-term) and the outcome of interest (such as binary, count, or time to event). Wherever possible, researchers could study the robustness of their scientific results under different appropriate study designs.

Confounding Adjustment Across Exposure Levels

Eighteen out of 26 studies performed confounding adjustment by including potential confounders in an outcome regression model. In all studies of long-term exposure, age and sex were adjusted for in the Cox proportional hazards model, along with additional individual or ecological covariates. In case-crossover study designs, individual covariates are balanced by design, and researchers adjusted for time-varying covariates. All but three papers [12, 19, 20] considered various model specifications, including which variables are to be included, the functional specification of the covariate adjustment, and the presence of effect modification. In our opinion, studies should evaluate and report the robustness of their results under different model specifications.

Model results were interpreted for the whole range of observed exposures. By including covariates in the regression model, the same confounding adjustment is performed across all exposure levels. For that reason, regression models are often implicitly based on the assumption that confounders are equally strong (or weak) across all exposure levels. If covariates confound the exposure–response relationship by differing amounts at different exposure levels, the estimated confounding adjustment will be a compromise of the confounding strength across the exposure levels, with exposure levels with high sample sizes driving the confounding adjustment the most. Therefore, in the presence of different confounding across exposure levels, model results might be biased, especially at the very low levels where data are sparse [33]. For this reason, confounding adjustment should be localized to target the exposure level of interest rather than assuming, as these studies have, that the confounding structure is constant across exposure levels.
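The compromise described above can be illustrated with a small simulation. In the hypothetical data below, the true exposure slope is 1.0 everywhere, but the confounder affects the outcome more strongly at low exposure (coefficient 3.0 below 8 μg/m3) than at high exposure (coefficient 0.5), and most observations sit at high exposure. All numbers, the linear outcome model, and the function names are assumptions for illustration and do not come from any reviewed study.

```python
import random


def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit(data):
    """Least squares fit of outcome ~ intercept + exposure + confounder."""
    rows = [[1.0, x, c] for x, c, _ in data]
    y = [v for _, _, v in data]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    return solve(xtx, xty)


def simulate(n_low=1000, n_high=4000, seed=0):
    """Hypothetical data with exposure-dependent confounding strength."""
    rng = random.Random(seed)
    data = []
    for n, lo, hi, conf_effect in ((n_low, 0.0, 8.0, 3.0), (n_high, 8.0, 16.0, 0.5)):
        for _ in range(n):
            x = rng.uniform(lo, hi)
            c = 0.5 * x + rng.gauss(0.0, 1.0)  # confounder correlated with exposure
            y = 1.0 * x + conf_effect * c + rng.gauss(0.0, 0.5)
            data.append((x, c, y))
    return data


data = simulate()
global_fit = fit(data)                                # one adjustment for all levels
local_fit = fit([d for d in data if d[0] < 8.0])      # adjustment localized to low levels

print(f"global: exposure slope {global_fit[1]:.2f}, confounder coef {global_fit[2]:.2f}")
print(f"local:  exposure slope {local_fit[1]:.2f}, confounder coef {local_fit[2]:.2f}")
```

In this setup the global confounder coefficient lands between the two true values and closer to the one from the data-rich high-exposure range, and the global exposure slope is pulled away from 1.0, whereas the fit localized to low levels recovers a slope near 1.0.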

Restricting Analyses to Low Levels

Eight out of 26 studies considered a model fit to a subset of the data below a certain pollution level (annual or daily). Restricting the analysis to a subset of the data is useful in localizing the confounding adjustment, and for this reason, study results on the health effects of air pollution at these levels may be more informative.

However, restricting the analysis to a subset of the data has some interpretational limitations. Considering a subgroup of the data effectively changes the population of interest. Specifically, it is likely that the subpopulation exposed to low levels of PM2.5 does not have the same characteristics as the full study population. If the distribution of certain modifiers of the association between PM2.5 and the outcome of interest (e.g., rural vs. urban residence, age, socio-economic status) differs between participants living at lower exposure levels and the full population, then the effect estimates from the restricted analysis are not necessarily directly comparable to those of the full analysis. For example, the composition of PM2.5, a well-recognized modifier [34, 35], likely differs across locations with different PM2.5 concentrations [36]. Such differences should be taken into account when comparing estimated PM2.5 effects at low levels from restricted analyses to results from a full data analysis, because the two sets of estimates may correspond to different, non-comparable populations. They should also be considered when interpreting non-linear concentration-response curves, as differences in the distribution of modifiers across the PM2.5 concentration range could manifest as a non-linear association. Therefore, when a restricted analysis is performed, researchers should provide descriptive statistics of both the target population (all data) and the restricted population (the subset on which the analysis is performed). Finally, restricting analyses to study participants exposed at low levels might also induce selection bias when the interest lies in estimating the exposure effect for the full population at low levels, if this restriction also changes the distribution of the outcome of interest.
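A short worked calculation shows how a difference in modifier distribution alone can separate the restricted and full estimates. The stratum-specific effects and population shares below are hypothetical, and the sketch assumes that the effect within each stratum is the same at all exposure levels, so any difference between the two averages is purely compositional.

```python
def population_average_effect(stratum_effects, stratum_shares):
    """Average effect in a population, weighting each stratum-specific effect
    by that stratum's share of the population."""
    if abs(sum(stratum_shares.values()) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    return sum(stratum_effects[k] * stratum_shares[k] for k in stratum_effects)


# Hypothetical percent increases in the outcome per 1 ug/m3, by residence.
effects = {"rural": 1.5, "urban": 0.5}

# Hypothetical modifier distributions: rural residents make up a larger share
# of the low-exposure (restricted) population than of the full population.
full_shares = {"rural": 0.3, "urban": 0.7}
restricted_shares = {"rural": 0.7, "urban": 0.3}

full_effect = population_average_effect(effects, full_shares)
restricted_effect = population_average_effect(effects, restricted_shares)

print(f"full: {full_effect:.2f}%, restricted: {restricted_effect:.2f}%")
# prints full: 0.80%, restricted: 1.20%
```

Here the restricted estimate is higher even though no stratum responds differently at low levels, which is why descriptive statistics for both populations are needed to interpret such comparisons.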

These challenges are always present when study results from one population are used to inform what would occur in a different population. For the reasons described above, findings from studies conducted using information on populations living at low levels of air pollution might not be generalizable to populations living at higher levels.

Exposure Assessment and Measurement Error

Seventeen of 26 studies relied on spatio-temporal models to assign exposure to individuals, while 7 studies used PM2.5 concentrations measured at monitoring stations. Although the error structure is known to vary by study design and exposure window examined, studies quantifying exposure measurement error under different scenarios have all shown a bias towards the null [37,38,39]. Prediction models allow researchers to assign exposures to participants living in areas with limited or no monitors. Although the use of such prediction models induces less exposure measurement error than the use of monitors, the predictions are still subject to potential measurement error [39]. Only one of the examined studies attempted to correct for this measurement error [30], and three conducted sensitivity analyses to evaluate the robustness of results obtained using model predictions compared to observed concentrations at monitoring locations [2••, 3, 30]. In addition, a different study compared circulatory mortality effect estimates obtained using different prediction models and observed significantly harmful associations in every model, although there was some variability in the estimated hazard ratios [9]. Similar findings were obtained in another study that compared cause-specific mortality risks in urban vs. rural areas using five different exposure models [11]. While there is some literature attempting to correct for exposure measurement error in this context [37, 40, 41], to date, no study has explicitly focused on measurement error arising at low concentrations of air pollution or on the potentially differing measurement error across the distribution of PM2.5 concentrations. As these spatio-temporal prediction models are developed overall and not specifically for low concentrations, their predictive accuracy may differ across the range of the observed PM2.5 concentrations, and subsequently the measurement error structure may also differ by exposure level. The uncertainty in these predictions, even in the absence of bias, also likely differs across the range of the observed PM2.5 concentrations. Lastly, large measurement error at low exposure levels is likely to lead to reduced statistical power in detecting health effects at low levels.
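The bias towards the null mentioned above can be reproduced in a simple simulation, assuming classical measurement error (the measured exposure equals the true exposure plus independent noise) and a linear outcome model. Other error structures, such as Berkson-type error, behave differently, so this is only one scenario. The sample size, variances, and names below are hypothetical.

```python
import random


def ols_slope(x, y):
    """Slope of the simple least squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


rng = random.Random(1)
TRUE_SLOPE = 1.0
N = 20000

# Hypothetical true exposures (ug/m3) and error-prone measurements of them.
true_pm = [rng.gauss(8.0, 2.0) for _ in range(N)]
measured_pm = [x + rng.gauss(0.0, 2.0) for x in true_pm]  # classical error
outcome = [TRUE_SLOPE * x + rng.gauss(0.0, 1.0) for x in true_pm]

slope_true = ols_slope(true_pm, outcome)
slope_naive = ols_slope(measured_pm, outcome)

# Under classical error the expected attenuation factor is
# var(true) / (var(true) + var(error)) = 4 / (4 + 4) = 0.5 in this setup.
print(f"slope with true exposure: {slope_true:.2f}")
print(f"slope with measured exposure: {slope_naive:.2f}")
```

If the error variance were larger at low concentrations than elsewhere, the attenuation would also be stronger there, which is one way a differing error structure across the exposure range could distort the estimated shape of the concentration-response curve.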

Discussion

We searched the literature between 2015 and 2018 to identify articles that characterize the relationship between short- and long-term exposure to PM2.5 and cardio-respiratory hospital admissions and mortality. We identified 26 papers that satisfied our search criteria. These papers used different statistical models, exposure assessment methods, and different ways to evaluate effect estimates at low levels. Nonetheless, most of the included studies reported statistically significant harmful associations even at levels below the current standards.

As the PM2.5 concentrations are declining, there is increasing interest in quantifying the PM2.5 effects on adverse health at these lower concentrations. Past studies had reported no threshold in associations with air pollution [42,43,44]; the concentration distributions of those studies, nonetheless, do not reflect current and future levels. We classified the articles in this paper into three distinct categories, one of which was studies that flexibly examined the concentration-response curve across the observed PM2.5 range. Half of those studies reported a linear association, while the other half reported a supralinear association, with larger effect estimates at lower levels.

These results were in agreement with results from studies in which sub-analyses restricted the study population to a sample only exposed at low levels. The effect estimates of these sub-analyses were almost always higher than the effect estimates obtained in the full population (Fig. 2). Nonetheless, these results should be interpreted with caution. It is likely that the sub-populations that live at low levels are not exchangeable with the populations that live at higher levels. If the distribution of potential modifiers differs across the range of the PM2.5 concentrations, then this could explain the changes in the estimated effects above and below the standards. PM2.5 levels, for instance, have been found to vary across census tracts with differing age, racial, and socio-economic distributions [45], and these factors have also been identified as known modifiers [2••, 46, 47]. Moreover, locations with higher PM2.5 concentrations are likely to have different constituent composition [36], which in turn has been found to modify the association between PM2.5 and adverse health [34]. Direct comparisons, therefore, between effect estimates above and below the standards should be avoided unless accompanied by a comparison of population and composition characteristics.

There have been a few studies recently that characterize the concentration-response function between PM2.5 and health outcomes. For instance, Burnett et al. [48] developed an integrated exposure–response (IER) function that covered the global range of exposure by integrating available information on rate ratios from studies of ambient air pollution, secondhand tobacco smoke, household solid cooking fuel, and active smoking. Pope et al. [49] subsequently applied this IER approach and observed a steeper association between PM2.5 and CVD at low exposures that leveled off at higher exposures, concluding that breathing combustion-related fine particulate matter from multiple sources contributes to CVD risk. In another study, Burnett et al. [50] used the Global Exposure Mortality Model (GEMM), developed by Nasari et al. [32], and information on 41 cohorts from 16 countries, and observed a supralinear association between PM2.5 and non-accidental mortality, with a different shape for each cause of death. This study, however, focused on a wide range of concentrations, not solely on low levels. Finally, a recent meta-regression also reported a non-linear concentration-response curve with a steeper slope at lower levels, in agreement with previously reported estimates [51].

To conclude, there is strong evidence suggesting that particle exposures remain harmful even at levels that are lower than the current US standards. Future studies could address the potential methodological limitations that may be present in the studies examined and are related to known forms of bias, i.e., residual confounding, selection bias, and exposure measurement error. They could also address potential interpretational limitations; if the population characteristics and the particle composition are not comparable at lower and higher concentration levels, then the estimated health effects may not be comparable either. Despite these limitations, the studies considered in this review differ in study design, population, exposure assessment approach, and confounding adjustment. Therefore, it is highly unlikely that the estimated effects at low PM2.5 levels are fully attributable to these limitations, and we consider the consistency of the study results to indicate the presence of negative health effects even at low PM2.5 levels.

Conclusions

The body of evidence to date strongly suggests that no threshold exists in the association between PM2.5 and adverse health, and that no levels are safe. The effect estimates reported in the studies cited here suggest that the NAAQS may need to be reevaluated; improving air quality with even lower PM2.5 than currently allowed by US EPA standards will benefit public health.