Introduction

As the International Agency for Research on Cancer (IARC) reported in 2020, breast cancer has overtaken lung cancer to become the most commonly diagnosed cancer globally (Siegel et al. 2020). Population-based cancer registry data have identified significant increases in the incidence of breast cancer over the last three decades, particularly among young women aged 25–39 years (Johnson et al. 2013). The high incidence is not fully explained by the well-established genetic and lifestyle risk factors, including genetic mutations, breastfeeding, smoking, family history, hormonal therapy, alcohol consumption, prior benign breast disease, and body mass index (BMI) (Bray et al. 2018; Sahay et al. 2019). Environmental risk factors such as ambient air pollution (AAP) have therefore been proposed to explain the high breast cancer incidence. AAP is ubiquitous, although its concentration and composition vary widely over space and time. In 2013, the IARC classified AAP as a whole, and particulate matter (PM) specifically, as carcinogenic to humans (Loomis et al. 2013). Experimental studies have provided some evidence linking carcinogens present in AAP (e.g., polycyclic aromatic hydrocarbons (PAHs) and benzene) to breast cancer through oxidative DNA damage and estrogen disruption (Chen et al. 2013; Mordukhovich et al. 2010), which makes it biologically plausible to explore the relationship between exposure to AAP and the risk of breast cancer in humans.

In recent years, a growing body of studies has evaluated the association of long-term AAP exposure with the risk of breast cancer but has yielded mixed results (Andersen et al. 2017a; Andersen et al. 2017b; Bai et al. 2020; Cheng et al. 2020; Cohen et al. 2018; Crouse et al. 2010; Datzmann et al. 2018; Goldberg et al. 2017; Goldberg et al. 2019; Hart et al. 2016; Hystad et al. 2015; Raaschou-Nielsen et al. 2011; Reding et al. 2015; To et al. 2015; Villeneuve et al. 2018; White et al. 2019; White et al. 2021). These studies also varied in study design, study location, study population, air pollution assessment approach, breast cancer ascertainment, and covariate adjustment. Thus, a comprehensive systematic review and meta-analysis is needed to quantify the association. To the best of our knowledge, three previous meta-analyses (Keramatinia et al. 2016; Yi et al. 2017; Zhang et al. 2019) have reported on the association between exposure to AAP and the risk of breast cancer. However, Keramatinia et al. (2016) analyzed only the correlation between NO2 exposure and breast cancer, based on three ecological studies and two individual-level studies, while Yi et al. (2017) inappropriately evaluated overall air pollution exposure in relation to breast cancer risk. Zhang et al. (2019) reported only the effects of PM2.5 and PM10 exposure on breast cancer incidence and mortality and did not consider gaseous pollutants. Here, we developed a framework consisting of population, exposure, comparator, outcomes, and study design (PECOS) to further clarify the relationship between exposure to AAP and breast cancer incidence based on the current evidence. We aimed to answer the following question: “Among women in the general population, what effect does long-term exposure to AAP have on breast cancer incidence?” To this end, a systematic review and meta-analysis was performed following the guidelines of the National Toxicology Program/Office of Health Assessment and Translation (NTP/OHAT) Handbook for Conducting a Literature-Based Health Assessment. This handbook provides a standardized methodology for applying the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) approach to environmental health assessments. In addition, we compared the protocols of the existing systematic reviews with that of our study.

Methods

Search strategy

This study is reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement; the completed checklist is provided in Table S1. Epidemiological studies reporting the effects of long-term AAP exposure on breast cancer incidence were searched in three major academic databases (Web of Science, PubMed, and EMBASE) for records published before March 10, 2020, and a follow-up search was carried out on February 1, 2021, to identify any further eligible publications. Search strategies were formulated by combining air pollution terms with breast cancer terms. Detailed search strategies are described in Table S2. We also screened the reference lists of eligible articles and related reviews to identify additional studies.

Study selection

Based on the predesigned PECOS framework, we defined the eligibility criteria as follows: (P) the study was conducted among humans; (E) the study investigated long-term (annual mean) exposure to AAP; (C) the study provided quantitative effect estimates; (O) breast cancer was confirmed by clinical assessment using the International Classification of Diseases (ICD), medical registry data or other reliable methods; and (S) the study adopted a cohort, case-control or cross-sectional design. Studies were excluded if they reported only correlations without health effect estimates or if they used indirect measures of AAP, such as distance to roadways or traffic density at the nearest road. Studies that focused only on the constituents of AAP or on breast cancer mortality were also excluded, as were reviews and animal studies. After duplicates were removed, two co-authors (WW and BJW) independently performed the preliminary selection by checking titles and abstracts. The full texts of the remaining articles were then evaluated to verify their eligibility (Fig. 1). Disagreements were referred to a third co-author (CYH) for adjudication.

Fig. 1 Flowchart of the study selection process

Data extraction and assessment of methodological quality

We predefined a data template to extract the following elements from eligible studies: reference (study design), country (study period), outcome definition and incidence, air pollutant(s) and their distribution, exposure assessment approach, exposure time, and covariates adjusted for in the analysis. Data extraction was conducted independently by two authors (WW and BJW) using a standardized form. Any discrepancy was resolved through group discussion prior to the meta-analysis. We also contacted the authors of included studies for information that had not been reported. In addition, we employed the Newcastle-Ottawa scale (NOS) to assess the quality of studies with a cohort or case-control design (Yang et al. 2020). Using a “star system,” each eligible study was evaluated on 8 items in 3 subscales: (1) how the study groups were selected; (2) how comparable the groups were; and (3) how the exposure or outcome of interest was ascertained for case-control or cohort studies, respectively. One star was equivalent to one point, and each study could receive a maximum of 9 points. A study with a score ≥ 7 was considered “high quality”; otherwise, the study was considered “low quality.”

Risk of bias assessment

The risk of bias (RoB) of each individual study was assessed. RoB assessment is related to, but distinct from, methodological quality assessment: RoB concerns the features of a study that may introduce systematic error into the effect estimate (Woodruff and Sutton 2014). Studies of high methodological quality may still carry a high RoB, which ultimately influences the magnitude or direction of an association. There is currently no standardized and validated framework for assessing RoB in environmental epidemiological studies. Available RoB assessment tools include the Risk Of Bias In Non-randomized Studies of Interventions (ROBINS-I) tool (Sterne et al. 2016), the National Institute of Environmental Health Sciences (NIEHS) OHAT RoB assessment, and the RoB assessment used for the updated WHO Air Quality Guidelines (AQGs). These tools all have their own strengths and limitations; we used the domain-based tool developed by NTP/OHAT to support our final conclusions (NTP/OHAT 2019). The RoB tool consists of a common set of questions that are answered from the specific details of each study to formulate an RoB rating for each domain. For observational human studies, seven domains were considered. The three key domains are confounding bias, detection bias for exposure characterization, and detection bias for outcome assessment; the four other domains are attrition/exclusion bias, selective reporting bias, selection bias, and conflict of interest. Ratings were categorized as “definitely high,” “probably high,” “probably low” or “definitely low” for each domain. If a study provided insufficient information to judge a domain, the rating “probably high” was assigned. Finally, we classified each individual study as tier 1, 2 or 3 for RoB. The RoB of the overall body of evidence was then rated as “not likely,” “serious” or “very serious.”

Data synthesis and analysis

The systematic review aimed to provide a quantitative summary effect estimate of breast cancer risk per unit increase in air pollutant exposure. To this end, we combined the results of single studies that were suitable for pooling. A meta-analysis was performed when three or more studies were identified for the same air pollutant–breast cancer combination. Before pooling, we converted the effect estimates for NOx into NO2 effects using a conversion factor of 0.75, since only two studies reported the effects of NOx. NO2 concentrations expressed in ppb were converted into μg/m3 using the standard WHO scaling factors; the conversion factor was 1.88 (at 25 °C and 1013 mb) for both NO2 and NOx.
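The two conversions described above can be sketched as follows. This is only an illustration, not the authors' code: the function names are hypothetical, and the sketch assumes that the factor of 0.75 is applied on the concentration scale, so that an exposure increment expressed in NOx is treated as 0.75 times that increment in NO2. The text does not state exactly how the factor was applied.

```python
# Constants taken from the text; everything else is a hypothetical illustration.
UGM3_PER_PPB = 1.88   # ppb -> ug/m3 at 25 degrees C and 1013 mb (stated for NO2 and NOx)
NOX_TO_NO2 = 0.75     # NOx -> NO2 conversion factor stated in the text


def ppb_to_ugm3(concentration_ppb: float) -> float:
    """Convert a concentration or exposure increment from ppb to ug/m3."""
    return concentration_ppb * UGM3_PER_PPB


def nox_increment_as_no2(nox_increment_ugm3: float) -> float:
    """Express a NOx exposure increment as an NO2 increment.

    Assumption: the 0.75 factor scales the concentration, not the effect estimate.
    """
    return nox_increment_ugm3 * NOX_TO_NO2
```

Under these assumptions, an estimate reported per 10 ppb of NO2 corresponds to an increment of about 18.8 μg/m3, and an estimate reported per 100 μg/m3 of NOx corresponds to an increment of 75 μg/m3 of NO2.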

Because all included studies used a continuous exposure analysis, we extracted only the effect estimates from the continuous exposure analysis when a study reported both continuous and categorical exposure analyses. If a study presented results for more than one air pollution exposure assessment approach, the main analysis used the effect estimates from the approach most commonly used in the other studies (the land-use regression (LUR) model). Effect estimates derived from other approaches, such as the unadjusted satellite method or the scaled satellite method, were entered in sensitivity analyses. Goldberg et al. (2019) reported breast cancer risk separately by the menopausal status of the participants. We combined the premenopausal and postmenopausal estimates into a single effect estimate using a fixed-effect meta-analysis and entered the combined estimate in the main meta-analysis. Three studies (Andersen et al. 2017b; Crouse et al. 2010; Goldberg et al. 2017) reported the risk of breast cancer only among postmenopausal women and were included in the meta-analysis directly. We pooled hazard ratios (HRs) and odds ratios (ORs) of breast cancer risk in a random-effects model and thus included the effect estimates in the meta-analysis regardless of study design. The effect estimates and 95% confidence intervals (CIs) from each individual study were extracted from the most fully adjusted single-pollutant models. HRs were used as the measure of association across all studies because most included studies were cohort studies, and ORs were treated as equivalent to HRs. To permit comparison of effect estimates between studies, the reported estimates were converted to common exposure increments. Thus, for each study, an effect estimate and its 95% CI were obtained for the association between breast cancer incidence and each 10 μg/m3 increase in PM2.5, PM10 or NO2 exposure, which presumes a linear exposure-outcome relationship.
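The rescaling to a common increment and the fixed-effect combination of stratum-specific estimates can be sketched as below. This is a minimal sketch under stated assumptions rather than the authors' procedure: it assumes a log-linear exposure-outcome relationship, recovers standard errors from the reported 95% CIs with the normal quantile 1.96, and uses hypothetical function names.

```python
import math


def rescale_ratio(est, lower, upper, reported_increment, target_increment=10.0):
    """Rescale a ratio estimate (HR or OR) and its 95% CI to a common increment.

    Assumes log-linearity: ln(ratio) scales in proportion to the exposure increment.
    """
    k = target_increment / reported_increment
    return tuple(math.exp(math.log(v) * k) for v in (est, lower, upper))


def se_from_ci(lower, upper):
    """Standard error of the log ratio recovered from a 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * 1.96)


def fixed_effect_pool(estimates):
    """Inverse-variance fixed-effect pooling of (est, lower, upper) ratio estimates."""
    weights = [1.0 / se_from_ci(lo, hi) ** 2 for _, lo, hi in estimates]
    logs = [math.log(est) for est, _, _ in estimates]
    pooled = sum(w * y for w, y in zip(weights, logs)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return (math.exp(pooled),
            math.exp(pooled - 1.96 * se),
            math.exp(pooled + 1.96 * se))
```

For instance, an estimate reported per 5 μg/m3 would be passed as `rescale_ratio(est, lower, upper, reported_increment=5.0)`, and two menopausal-status strata would be combined with `fixed_effect_pool([premenopausal, postmenopausal])`, where each argument is an `(est, lower, upper)` tuple.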

τ2 (tau-squared) was used to describe the between-study variance in the random-effects model. Heterogeneity was evaluated using Cochran’s Q and the I2 statistic. Cochran’s Q provides a test of significance for heterogeneity, and a p value of less than 0.1 was deemed statistically significant. The I2 statistic quantifies the degree of inconsistency among studies. The results were interpreted following the Cochrane criteria: (1) if I2 was between 0 and 40%, heterogeneity might not be important; (2) if I2 was between 30 and 60%, it may represent moderate heterogeneity; (3) if I2 was between 50 and 90%, it may represent substantial heterogeneity; and (4) if I2 was between 75 and 100%, it may represent considerable heterogeneity. If significant heterogeneity was observed, we attempted to explain its source by subgroup analysis. Subgroup analyses were carried out, as applicable, by (1) study design (cohort and case-control); (2) study location (Asia, North America, and Europe); (3) menopausal status (premenopausal and postmenopausal); and (4) estrogen/progesterone receptor status (ER+/PR+ and ER-/PR-). We also conducted sensitivity analyses by excluding one study at a time and recalculating the summary effect estimate to evaluate the influence of each study. Publication bias was assessed visually for asymmetry using funnel plots and statistically using Egger’s test; a p value of less than 0.1 was taken to indicate publication bias. Additionally, we used the trim-and-fill method to test and adjust for publication bias (Duval and Tweedie 2000). All analyses were performed in STATA® version 13.0 (Stata Corporation, TX, USA).
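The quantities named above (Cochran’s Q, I2, τ2 and the random-effects summary estimate) can be computed as in the following sketch. The text does not name the between-study variance estimator; the DerSimonian-Laird moment estimator is assumed here, and the function name and return keys are hypothetical. Inputs are log-scale effect estimates and their standard errors.

```python
import math


def random_effects_meta(log_effects, std_errors):
    """Random-effects pooling with the DerSimonian-Laird estimator (an assumption)."""
    w = [1.0 / se ** 2 for se in std_errors]          # inverse-variance weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_effects)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_effects))
    df = len(log_effects) - 1

    # Moment estimator of the between-study variance, truncated at zero
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # I2: percentage of total variation attributed to heterogeneity
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Random-effects weights add tau2 to each within-study variance
    w_re = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_effects)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))

    return {
        "hr": math.exp(pooled),
        "ci_lower": math.exp(pooled - 1.96 * se_pooled),
        "ci_upper": math.exp(pooled + 1.96 * se_pooled),
        "q": q,
        "df": df,
        "i2": i2,
        "tau2": tau2,
    }
```

A leave-one-out sensitivity analysis of the kind described above would call this function repeatedly, each time omitting one study from both input lists.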

Confidence of cumulative evidence and certainty of the evidence

The quality of evidence for each air pollutant–breast cancer combination was evaluated following the modified version of the GRADE system that is commonly used for rating the confidence in cumulative evidence. The NTP/OHAT handbook provides more detailed guidance on rating the confidence in cumulative evidence as “very low,” “low,” “moderate” or “high.” In brief, eligible human studies were initially classified by key features of study design, with observational studies usually starting at a “moderate” level. Downgrading of the confidence rating was considered for factors that may decrease confidence in the results, and upgrading was considered for factors that may increase confidence in the results. Next, the confidence ratings were translated into the certainty of evidence of health effects for each air pollutant–breast cancer combination according to one of five statements: “high,” “moderate,” “low,” “inadequate” or “evidence of no health effect.”

Results

Literature search and study characteristics

As shown in Fig. 1, a total of 5446 articles were eligible for title/abstract screening after duplicates were removed. The full texts of 54 relevant articles were retrieved for further selection after title/abstract screening. Finally, we retained 18 articles (including 2 identified through the updated search) that fulfilled our inclusion criteria and were thus incorporated in our study. Table 1 and Table S3 summarize the characteristics of the 18 included studies in detail. Thirteen of the 18 studies were conducted in North America, three in Europe, one in Asia, and one in multiple countries. Most studies used a cohort design (n = 15), and the remaining three used a case-control design; sample sizes ranged from 799 to 8.5 million. Of the 18 studies, 16 were performed among the healthy population (Andersen et al. 2017a; Andersen et al. 2017b; Bai et al. 2020; Cheng et al. 2020; Crouse et al. 2010; Datzmann et al. 2018; Goldberg et al. 2017; Goldberg et al. 2019; Hart et al. 2016; Hystad et al. 2015; Raaschou-Nielsen et al. 2011; Reding et al. 2015; To et al. 2015; Villeneuve et al. 2018; White et al. 2019; White et al. 2021), one among coronary patients (Cohen et al. 2018), and one among cancer cases (Coleman et al. 2020). As to the exposure assessment approach, most of the 18 studies used exposure estimation models, including the LUR model (Andersen et al. 2017b; Bai et al. 2020; Cheng et al. 2020; Cohen et al. 2018; Crouse et al. 2010; Datzmann et al. 2018; Goldberg et al. 2017; Goldberg et al. 2019; Hystad et al. 2015), an air quality model (Bai et al. 2020), GEOS-Chem with a GWR model (Bai et al. 2020), CALINE4 (Cheng et al. 2020), spatiotemporal models (Hart et al. 2016), integrated empirical geographic regression models (Coleman et al. 2020), and different models for the different air pollutants studied (White et al. 2021); geostatistical interpolation methods, including kriging (Cheng et al. 2020; Reding et al. 2015; White et al. 2019); satellite-derived estimates (Hystad et al. 2015; To et al. 2015; Villeneuve et al. 2018); or country-specific air pollution assessment systems, such as the Danish integrated air pollution modeling system (Andersen et al. 2017a) and the Danish AirGIS modeling system (Raaschou-Nielsen et al. 2011). Table S3 shows the characteristics of outcome assessment among the included studies. Most studies ascertained breast cancer cases via linkage to national or regional cancer registries (Andersen et al. 2017a; Andersen et al. 2017b; Bai et al. 2020; Cheng et al. 2020; Cohen et al. 2018; Goldberg et al. 2019; Coleman et al. 2020; Hystad et al. 2015; Raaschou-Nielsen et al. 2011; Villeneuve et al. 2018) or routine healthcare/administrative databases (Datzmann et al. 2018; To et al. 2015), or via self-administered questionnaires with subsequent confirmation by medical records (Hart et al. 2016; Reding et al. 2015; White et al. 2019) or state cancer registries (White et al. 2021). Crouse et al. (2010) and Goldberg et al. (2017) used a case-control design and recruited women who were treated for breast cancer as cases. All cases were defined by ICD codes or International Classification of Diseases for Oncology (ICD-O) codes.

Table 1 Characteristics of included studies evaluating the association of long-term ambient air pollution exposure with breast cancer incidence

Assessment of study quality and risk of bias

All 18 studies scored at least 7 on the NOS checklist and were thus identified as high-quality studies (Table S4). The RoB assessment for each individual study is shown in Table 2. First, in terms of confounding bias, we predefined that a study would be rated as “probably low” RoB if it included most of the key confounding variables, such as race, age, BMI and at least one indicator of socioeconomic status (i.e., education or income), which may be directly associated with air pollution exposure levels and thus with breast cancer. If a study further included smoking status, alcohol consumption and hormone therapy, it was rated as “definitely low” RoB. On this basis, three studies (Bai et al. 2020; Cohen et al. 2018; Datzmann et al. 2018) did not sufficiently adjust for key confounders such as race, age, BMI or any indicator of socioeconomic status, so they were rated as “probably high” RoB. For exclusion bias, four studies (Cohen et al. 2018; Hart et al. 2016; Raaschou-Nielsen et al. 2011; Reding et al. 2015) did not specify their exclusion criteria and were thus rated as “probably high” RoB. For outcome detection bias, 17 studies were classified as “probably low” RoB, and one study (Reding et al. 2015) that provided insufficient information to justify the assessment was rated as “probably high” RoB. For exposure detection bias, one study (Coleman et al. 2020) used county-level air pollution as a proxy for individual exposure, so it was rated as “definitely high” RoB, whereas the remaining studies were classified as “probably low” RoB. For selective reporting bias, all included studies were rated as “definitely low” RoB because all measured outcomes were reported in sufficient detail. For selection bias, two hospital-based case-control studies (Crouse et al. 2010; Goldberg et al. 2017) were classified as “probably high” RoB because of limited representativeness. Three studies included only cancer patients (Coleman et al. 2020), black women (White et al. 2021), and patients undergoing percutaneous coronary interventions (Cohen et al. 2018), respectively, so they were appraised as having “definitely high” risk of selection bias. The authors of the 18 included studies either declared no conflict of interest or stated that their studies were sponsored by public funds, so the studies were rated as “probably low” RoB for this domain. In summary, the included studies were appraised as either tier 1 (n = 12) or tier 2 (n = 6) for RoB under the NTP/OHAT criteria, indicating the presence of plausible bias that may raise some doubt about the results.

Table 2 The assessment of risk of bias of each included study using the National Toxicology Program/Office of Health Assessment and Translation (NTP/OHAT) risk of bias rating tool

Ambient air pollution exposure and the risk of breast cancer

NO2 exposure and breast cancer incidence

Figure 2 shows the summary effect estimates of the eligible studies that explored the associations of long-term AAP exposure with breast cancer incidence. A total of sixteen studies investigated the association of NO2 exposure (including two rescaled from NOx) with breast cancer. Under the random-effects model, NO2 exposure was significantly associated with an increased risk of breast cancer (HR per 10 μg/m3 = 1.02, 95% CI = 1.01, 1.04; I2 = 46.8%; τ2 = 0.0002). The publication bias assessment plot in combination with Egger’s test (p value = 0.024 < 0.1) indicated potential publication bias (Fig. 3). The trim-and-fill method identified six missing studies for NO2, and the adjusted estimate was not substantially changed (HR = 1.01, 95% CI = 1.00, 1.03). In sensitivity analyses, the summary effect estimates were not substantially changed when any individual study or particular group of studies was removed, showing that the results were robust (Table S5). The summary effect estimate for NO2 exposure was 1.02 (95% CI = 1.01, 1.04; I2 = 51.2%; τ2 = 0.0003) when the two studies that reported the effects of NOx exposure on breast cancer incidence (Andersen et al. 2017b; Raaschou-Nielsen et al. 2011) were excluded. The summary effect estimates also did not change sharply (results not shown) when effect estimates produced by other exposure assessment methods were entered instead. As shown in Table 3, subgroup analyses showed no statistically significant differences in the association of NO2 exposure with breast cancer risk across geographic areas, menopausal status, or estrogen/progesterone receptor status, whereas a statistically significant difference was found between study designs, which may be the source of heterogeneity.

Fig. 2 Forest plots for the associations between long-term (A) NO2, (B) PM2.5, and (C) PM10 exposure and breast cancer incidence. HR, hazard ratio; CI, confidence interval; NO2, nitrogen dioxide; PM2.5, particulate matter with aerodynamic diameters less than or equal to 2.5 μm; PM10, particulate matter with aerodynamic diameters less than or equal to 10 μm

Fig. 3 Bias assessment plots for the associations between long-term (A) PM2.5, (B) PM10, and (C) NO2 exposure and breast cancer incidence. PM2.5, particulate matter with aerodynamic diameters less than or equal to 2.5 μm; PM10, particulate matter with aerodynamic diameters less than or equal to 10 μm; NO2, nitrogen dioxide.

Table 3 Summary effect estimates (HRs) and 95% confidence intervals (CIs) of the association between long-term ambient air pollution exposure (per 10 μg/m3 increase) and breast cancer incidence by subgroup

Particulate matter exposure and breast cancer incidence

The relationships of exposure to PM2.5 and PM10 with the incidence of breast cancer were assessed by eleven and seven studies, respectively (Fig. 2). Neither association was statistically significant under the random-effects models (PM2.5: HR = 1.03, 95% CI = 0.99, 1.06; PM10: HR = 1.04, 95% CI = 0.98, 1.10). No between-study heterogeneity was found for PM2.5 (I2 = 8.2%; τ2 = 0.0003), while moderate between-study heterogeneity was found for PM10 (I2 = 70.3%; τ2 = 0.0031). The bias assessment plots combined with Egger’s tests showed no publication bias for PM2.5 (p value = 0.51 > 0.1) but indicated publication bias for PM10 (p value = 0.06 < 0.1) (Fig. 3). Our trim-and-fill analysis identified four missing studies for PM10, and the adjusted estimate was not substantially changed (HR = 0.99, 95% CI = 0.94, 1.03). In sensitivity analyses in which each study was omitted in turn, the summary effect estimates for PM2.5 and PM10 were not substantially changed, which shows the robustness of the results (Table S5). Additionally, the summary effect estimates did not change markedly (results not shown) when effect estimates produced by other exposure assessment models were entered instead. As shown in Table 3, subgroup analyses indicated similar results, with the exception of a significantly increased risk associated with PM10 exposure in Europe. The association of O3 exposure with breast cancer incidence was assessed by two studies (Bai et al. 2020; White et al. 2021), which reported a nonsignificant positive association. We did not perform a meta-analysis for the O3–breast cancer combination because of the limited data.

Confidence of cumulative evidence and certainty of the evidence

Under the NTP/OHAT framework, only controlled experimental studies start in the “high confidence” category. Since all of our included studies were observational, we started the rating process from “moderate confidence,” because observational studies bear the risk of unmeasured confounding.

Downgrading factors

According to the NTP/OHAT handbook, downgrading the confidence rating for RoB should be reserved for cases in which substantial RoB across most of the included studies compromises the body of evidence. The eligible studies were rated as “tier 1” or “tier 2” RoB, so we judged that the initial confidence rating should not be downgraded for RoB. All included studies met our predefined eligibility criteria under the PECOS statement, so we did not downgrade the initial confidence rating for indirectness. Because the NTP/OHAT handbook notes that there is no ideal single measure of consistency, three factors, namely the similarity of point estimates, the extent of overlap between CIs, and the results of heterogeneity tests, were taken into consideration when determining whether to downgrade for unexplained inconsistency. Based on these factors, we did not downgrade the initial confidence rating of the three exposure-outcome combinations. The NTP/OHAT handbook defines 95% CIs as the primary means of assessing imprecision. For ratio measures (HR in this study), the criterion is that the ratio of the upper to the lower 95% confidence limit of the meta-estimate be less than 10, and none of the ratios in this study exceeded 10. Therefore, the initial confidence rating was not downgraded for imprecision. Publication bias was statistically significant for two of the three exposure-outcome combinations tested. However, it is widely recognized that such statistical tests are unreliable for evaluating publication bias when only a small number of studies are included. In addition, the NTP/OHAT handbook advises a more extensive evaluation of publication bias when studies are supported by industry sponsors or nongovernmental organizations, when early positive studies exist (particularly small ones), and when searches identify abstracts of unpublished studies, conference papers, or other grey literature. In brief, we did not downgrade the initial confidence rating for publication bias, because most studies were sponsored by public funds and no theses, conference papers, or abstracts of unpublished studies were found.
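The imprecision criterion mentioned above can be checked directly from the summary estimates reported in the Results. The sketch below is illustrative only: the variable and function names are hypothetical, and the CI limits are the pooled values quoted in the text for NO2, PM2.5 and PM10.

```python
# Pooled 95% CI limits (lower, upper) as reported in the Results section.
SUMMARY_CIS = {
    "NO2": (1.01, 1.04),
    "PM2.5": (0.99, 1.06),
    "PM10": (0.98, 1.10),
}

# Threshold for the upper-to-lower CI ratio described in the text.
IMPRECISION_THRESHOLD = 10.0


def ci_ratio(lower: float, upper: float) -> float:
    """Ratio of the upper to the lower 95% confidence limit."""
    return upper / lower


ci_ratios = {pollutant: ci_ratio(lo, hi) for pollutant, (lo, hi) in SUMMARY_CIS.items()}
needs_downgrade = {p: r >= IMPRECISION_THRESHOLD for p, r in ci_ratios.items()}
```

All three ratios are only slightly above 1, far below the threshold of 10, which is consistent with the decision not to downgrade for imprecision.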

Upgrading factors

Unmeasured confounding is listed as one of the main factors that can potentially upgrade the confidence rating. However, evidence is lacking on how strongly any unmeasured confounder is associated with the outcome of interest, so the initial confidence rating was not upgraded. We also did not upgrade the initial confidence rating because the included studies did not report a clear (non)monotonic dose-response relationship.

Overall, we did not find compelling factors that would change the initial confidence ratings for the studied air pollutant–breast cancer combinations, and the confidence in the body of evidence was rated as “moderate” for each exposure-outcome combination. Because a significant association with breast cancer risk was detected for NO2 exposure but not for PM2.5 or PM10 exposure, we conclude that there is a “moderate level of evidence” for the association between NO2 exposure and an increased risk of breast cancer, whereas PM2.5 and PM10 exposure showed an “inadequate level of evidence.”

Comparison of systematic review protocols

The protocols of two previous systematic reviews and meta-analyses and that of our study were compared (Table S6). Our study provides more comprehensive and updated evidence regarding the relationship between AAP exposure and the risk of breast cancer. Keramatinia et al. (2016) analyzed the correlation between NO2 exposure and breast cancer using a meta-analysis approach and reported a pooled correlation coefficient of 0.89 (95% CI = 0.84, 0.95) based on three ecological studies and two individual-level studies (one cohort and one case-control), while a more recent meta-analysis (Zhang et al. 2019) evaluated the effects of PM2.5 and PM10 exposure on breast cancer incidence and mortality with fourteen studies included (7 studies on mortality and 7 studies on morbidity) and reported that both PM2.5 (OR per 10 μg/m3 = 1.17, 95% CI = 1.05, 1.30) and PM10 (OR per 10 μg/m3 = 1.11, 95% CI = 1.02, 1.21) exposures were associated with higher odds of breast cancer mortality, but not of breast cancer incidence. Keramatinia et al. (2016) applied the Meta-analysis of Observational Studies in Epidemiology (MOOSE) guideline for the review of observational studies, while Zhang et al. (2019) and our study used the PRISMA guideline. The three studies differed somewhat in how exposure and outcome were defined. For assessing RoB in systematic reviews of environmental health exposures, the NTP/OHAT guideline is recommended; however, only our study employed it. Moreover, only our study used the GRADE approach, which is also recommended by the Cochrane handbook, to summarize the level of evidence. By contrast, the two previous systematic reviews and meta-analyses did not adopt any approach to evaluate the level of evidence.

Discussion

Main findings

We aimed to comprehensively evaluate the association of long-term AAP exposure with breast cancer incidence by synthesizing the effect estimates of 18 included studies from multiple countries. A statistically significant association was found only between NO2 exposure and breast cancer incidence, and the evidence for this association was appraised as “moderate” by the modified GRADE guideline. However, the results need to be interpreted with caution because of the moderate between-study heterogeneity among the studied air pollutant–breast cancer combinations. Future meta-analyses including studies with more varied populations are therefore needed to draw more robust and definitive conclusions. Meanwhile, the evidence synthesized to date provides little support for an association of PM exposure with breast cancer risk. However, individual constituents of PM (i.e., nickel and vanadium), as identified in the European Study of Cohorts for Air Pollution Effects (ESCAPE) project, may be triggers for breast cancer (Andersen et al. 2017b), as may other atmospheric pollutants such as polychlorinated biphenyl congener 153 and benzo[a]pyrene (Amadou et al. 2021; Deygas et al. 2021). Therefore, more studies are warranted to better understand the potential role of PM constituents and other atmospheric pollutants in breast cancer risk.

Definition of outcome and exposure assessment

Most of our included studies drew on administrative databases (e.g., national or regional cancer registries), and their outcome was defined using ICD codes. Routinely collected data from these databases offer a representative and cost-effective way to explore the relationship of long-term AAP exposure with breast cancer incidence, but such databases may introduce outcome misclassification. Moreover, claims-based diagnosis codes may be linked with socioeconomic status (SES) and comorbidities, which are themselves related to pollutant concentrations. Breast cancer is likely to be diagnosed more frequently in women of higher SES than in women of lower SES (Krieger et al. 2010). Although most included studies adjusted for SES-related covariates, differential misclassification of breast cancer due to SES cannot be totally ruled out.

The assessment of AAP exposure is critical for estimating the effects of air pollution on human health in epidemiological studies. Earlier studies usually assessed AAP exposure based on the distance of participants’ residences to central air quality monitoring stations or major roads (Dockery et al. 1993). More recently, environmental epidemiological studies have tended to estimate personal AAP exposure levels using different estimation models (e.g., LUR models and spatiotemporal models) to address the spatial variability of air pollutant concentrations. In our review, each of the included studies used its own exposure assessment method(s), each with its own advantages and disadvantages. We therefore propose that future studies use sophisticated estimation models, such as dispersion, LUR, or satellite-based models, to evaluate the relationship between long-term AAP exposure and breast cancer risk, especially where participants are sparsely distributed.

Pathophysiological mechanisms

The pathophysiological mechanisms underlying the relationship of long-term AAP exposure with breast cancer remain unclear, but some pathways have been hypothesized. It is well documented that AAP contains a mixture of several compounds, namely, gaseous pollutants, PM, metals, and organic compounds including benzene and PAHs. Some components of AAP, including PAHs, may act as endocrine disruptors, providing a possible mechanistic link to breast carcinogenesis (Loomis et al. 2013). Moreover, NO2 exposure may lower the methylation of the protumorigenic genes ephrin type-B receptor 2 (EPHB2) and mitochondrial lon protease (LonP1), both of which play important roles in breast cancer development (Gruzieva et al. 2017; Plusquin et al. 2017). This suggests a possible link between NO2 exposure and increased breast cancer risk. These pathophysiological activities may be more deleterious to breast tissue during the prenatal, pubertal, and pregnancy windows of susceptibility to environmental toxicants, when the breast tissue is changing in form and function (Lope et al. 2016; Sanderson et al. 2013). Environmental toxicants can also exert their effects through alteration of epigenetic regulatory mechanisms (Callahan et al. 2018; White et al. 2016). The significant association of long-term ambient NO2 exposure with breast cancer incidence in our meta-analysis is in accordance with these proposed mechanisms.

Strengths and limitations

Our study has several strengths. First, it provides the most comprehensive evidence to date regarding the association of long-term AAP exposure with breast cancer incidence. The sample size was substantial, and we assessed the effects of not only airborne PM (PM2.5 and PM10) but also an airborne gaseous pollutant (NO2) on breast cancer incidence. Second, besides methodological quality assessment, we also assessed RoB for each individual study according to widely accepted and validated scales. Thus, our pooled results can help researchers identify research gaps and improve the quality of future studies. Furthermore, the reliability of our summary effect estimates was verified. We conducted several sensitivity analyses, rerunning the summary estimates after excluding each study in turn and after excluding one specific study that enrolled coronary patients as participants. Additionally, when a study reported multiple exposure assessment methods, we substituted the estimates from the alternative method, and the effect estimates remained stable.
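The leave-one-out sensitivity analysis described above can be illustrated with a minimal Python sketch. It assumes a DerSimonian-Laird random-effects model on the log scale, which is one common choice and not necessarily the exact model used in this review. The study names and hazard ratios below are hypothetical placeholders, not data from the included studies, and the function names are illustrative.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% CI


def to_log_scale(estimate, lower, upper):
    """Convert a ratio estimate and its 95% CI to (log estimate, standard error)."""
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    return math.log(estimate), se


def pool_random_effects(log_effects, std_errors):
    """DerSimonian-Laird pooling; returns (pooled log effect, SE, tau^2, I^2 in %)."""
    k = len(log_effects)
    w = [1.0 / se ** 2 for se in std_errors]  # inverse-variance weights
    fixed = sum(wi * yi for wi, yi in zip(w, log_effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_effects))  # Cochran's Q
    df = k - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-study variance
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_star, log_effects)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, se_pooled, tau2, i2


def leave_one_out(names, log_effects, std_errors):
    """Re-pool after omitting each study in turn; returns {omitted: (HR, lower, upper)}."""
    results = {}
    for i, name in enumerate(names):
        ys = list(log_effects[:i]) + list(log_effects[i + 1:])
        ses = list(std_errors[:i]) + list(std_errors[i + 1:])
        pooled, se, _, _ = pool_random_effects(ys, ses)
        results[name] = (
            math.exp(pooled),
            math.exp(pooled - Z_95 * se),
            math.exp(pooled + Z_95 * se),
        )
    return results


# Hypothetical hazard ratios with 95% CIs (illustrative only, not study data).
studies = {
    "Study A": (1.05, 0.98, 1.13),
    "Study B": (1.02, 0.95, 1.10),
    "Study C": (1.10, 1.01, 1.20),
    "Study D": (0.99, 0.90, 1.09),
}
names = list(studies)
pairs = [to_log_scale(*studies[n]) for n in names]
logs = [p[0] for p in pairs]
ses = [p[1] for p in pairs]

for omitted, (hr, lo, hi) in leave_one_out(names, logs, ses).items():
    print(f"Omitting {omitted}: HR = {hr:.3f} (95% CI = {lo:.3f}, {hi:.3f})")
```

If the pooled estimate and its confidence interval change little across the omitted studies, no single study is driving the summary effect.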

Some limitations of this study should be acknowledged. First, the locations of the 18 included studies were highly homogeneous, which limits the generalizability of the findings. All were performed in high-income countries, and none in low- and middle-income countries such as China, Pakistan, Afghanistan, Bangladesh, and India. Second, we were unable to assess potential additive or synergistic effects of correlated exposures in the context of multi-pollutant mixtures, since a majority of studies only reported estimates from single-pollutant models. Third, we did not meta-analyze the associations of less-reported pollutants, including NOx, SO2, and O3, with breast cancer incidence because too few studies were available, but we cannot rule out potential associations for these pollutants. Fourth, several studies reported risk estimates stratified by menstrual status rather than overall risk estimates; we synthesized each stratum as an independent data set, which may have affected the independence of the samples and introduced a RoB. Fifth, nearly all of the included studies used a linear model to fit the air pollutant–breast cancer associations, but the possibility of non-linear associations cannot be ruled out and should be explored in multi-pollutant studies, because exploring the synergy between pollutants without considering the potential non-linear effects of each individual pollutant may lead to wrong conclusions. Sixth, it remains unclear at what point in the lifetime AAP exposure may be most relevant to breast cancer development. The exposure should precede the development of breast cancer by an appropriate interval, and this interval should be consistent with the biological understanding of the time from exposure to observed disease. However, although an increasing number of studies have speculated that exposures early in life (e.g., prenatal) or during other critical time windows (e.g., pregnancy or menopause), when breast tissue is changing in form and function, may alter susceptibility to breast cancer, the length of the exposure (or modeling) period in our included studies ranged from 1 to 20 years, because the available data often do not go back far enough in time to capture the relevant period. Therefore, the differences among studies in how the temporal relationship between long-term AAP exposure and breast cancer incidence was evaluated make the results harder to interpret. Seventh, publication bias was detected for the combinations of NO2 and PM10 with breast cancer incidence, which suggests that small studies showing nonsignificant effects could remain unpublished, and thus the true effect could be overestimated. However, the trim-and-fill method verified the robustness of our results. Last, unmeasured or residual confounding may remain in our pooled effect estimates, although each included study controlled for potential confounders in its analysis model. Some important confounders may not have been considered because most of the included studies were based on predesigned cancer registry data.
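As an illustration of how small-study effects of the kind described above can be screened for, the following Python sketch implements Egger's regression test, in which the standardized effect (effect divided by its standard error) is regressed on precision (the reciprocal of the standard error), and an intercept far from zero suggests funnel plot asymmetry. This is an assumption made for illustration: the passage above does not state which test was used, and the function name and inputs are hypothetical.

```python
import math


def egger_test(log_effects, std_errors):
    """Egger's regression: standardized effect ~ intercept + slope * precision.

    Returns (intercept, slope, t statistic of the intercept, degrees of freedom).
    The t statistic should be compared with a t distribution on n - 2 degrees
    of freedom; it is NaN when the regression fits the data exactly.
    """
    n = len(log_effects)
    if n < 3:
        raise ValueError("Egger's test needs at least three studies")
    x = [1.0 / se for se in std_errors]                     # precision
    z = [y / se for y, se in zip(log_effects, std_errors)]  # standardized effect
    x_bar = sum(x) / n
    z_bar = sum(z) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must not all be equal")
    sxz = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z))
    slope = sxz / sxx
    intercept = z_bar - slope * x_bar
    # Residual variance and the standard error of the intercept (ordinary least squares).
    ssr = sum((zi - intercept - slope * xi) ** 2 for xi, zi in zip(x, z))
    se_intercept = math.sqrt(ssr / (n - 2) * (1.0 / n + x_bar ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, slope, t_stat, n - 2


# Hypothetical log hazard ratios and standard errors (illustrative only, not study data).
log_hrs = [0.03, 0.05, 0.10, 0.18, 0.02]
ses = [0.02, 0.04, 0.07, 0.12, 0.03]
intercept, slope, t_stat, df = egger_test(log_hrs, ses)
print(f"Egger intercept = {intercept:.3f}, t = {t_stat:.2f} on {df} df")
```

In this regression the slope estimates the underlying effect, while the intercept captures the tendency of smaller, less precise studies to report larger effects.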

Conclusions

In summary, our study suggests that there is a “moderate level of evidence” for an association between long-term NO2 exposure and increased risk of breast cancer. However, in view of the limitations, further studies, especially from countries with higher AAP levels, are warranted to provide more comprehensive information and to clarify the plausibility of the association between air pollution and breast cancer. Such studies should improve individual air pollution exposure assessment, harmonize the outcome of interest, and adjust for confounding variables, so as to better inform relevant policy-making.