Introduction

Traffic-related air pollution (TRAP) has been consistently associated with exacerbation of childhood asthma [1], and growing evidence supports an association with incident childhood asthma [2–5].

As part of the traffic-related air pollution and childhood asthma (TRAPCA) international collaboration, individual estimates of air pollution exposure were assigned to children in four European birth cohorts [6, 7]. Using similar methodology, individual exposures were also assigned to children in two Canadian birth cohorts [8, 9]. To date, four of these cohorts have reported statistically significant associations between traffic-related air pollution and asthma or atopic disease during childhood [10–13], and one has reported associations between TRAP and wheeze [14].

Gene-environment studies are of special interest in the examination of childhood asthma because they can identify the children most susceptible to the harmful effects of TRAP [2, 15], and identification of these interactions could provide biological plausibility for epidemiologic observations. Oxidative stress genes are of particular interest [16], but few studies have examined the development of asthma. Carriers of a specific GSTP1 variant have been identified as a susceptible population in the association between TRAP and allergic sensitization [17], persistent wheeze [18], and asthma [19]. A common limitation of gene-environment studies is insufficient statistical power. To address this issue, and to improve our understanding of gene-environment interactions, investigators have called for analyses that combine data from studies with similar assessments of air pollution and asthma [20].

The Traffic, Asthma and Genetics study (TAG) has combined data from multiple birth cohorts to examine the influence of candidate genes related to oxidative stress and inflammation on the association between TRAP and the incidence of asthma, allergic rhinitis, eczema and wheeze in childhood. Here we describe the methodology used to pool data and provide information on the combined dataset and individual cohorts.

Methods

We included six birth cohort studies in TAG (Table 1): The Canadian Asthma Primary Prevention Study (CAPPS) [10], The Study of Asthma, Genetics and Environment (SAGE) [21], The Children, Allergy, Milieu, Stockholm, Epidemiological Survey (BAMSE) [14, 17, 22], The German Infant Study on the Influence of Nutrition Intervention plus Environmental and Genetic Influences on Allergy Development Study (GINIplus) [23], the Influence of Life Style Factors on the Development of the Immune System and Allergies in East and West Germany plus the Influence of Traffic Emissions and Genetics Study (LISAplus) [23], and the Prevention and Incidence of Asthma and Mite Allergy (PIAMA) Study [24]. Detailed information for all six cohorts, including case definitions, is provided in the Supplementary Material. Children were born during the mid-to-late 1990s and were recruited primarily through hospitals, clinics and outpatient practices. SAGE identified children born in 1995 from a healthcare registry; asthma and allergy phenotypes were diagnosed by a physician at age 8, at which time parents reported the child's history retrospectively. CAPPS is the only study that did not originally recruit a population-based sample. Although SAGE and BAMSE recruited population-based samples, the data available for TAG are based on nested case–control samples.

Table 1 Summary of TAG birth cohorts

In each cohort the following information was available for all or a subset of children: TRAP, assigned individually based on address at birth; assessment of physician-diagnosed asthma at 7 or 8 years; and available genotyping data for single nucleotide polymorphisms (SNPs) of primary interest. The studies recruited children primarily in urban areas. A primary objective of each study was examination of the epidemiology of childhood asthma. In CAPPS, GINI and PIAMA a portion of the population was assigned to a preventive intervention (education and counseling, promotion of hypoallergenic formula, and the use of dust mite-impermeable mattress covers).

Exposure assessment

For all cohorts except BAMSE, annual average NO2, as an indicator of TRAP, was estimated for each child’s birth address using land use regression models [6, 8, 9, 25]. For all study sites, integrated 14-day samples (Ntotal = 40–116) were collected. Potential predictors of traffic were screened by examining their correlation with measured air pollution and final models were assessed based on the coefficient of determination and root mean square error from cross-validation.
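The evaluation step described above can be illustrated with a minimal sketch. It assumes leave-one-out cross-validation and, for brevity, a single hypothetical predictor (for example, a traffic-density variable); the published land use regression models used several predictors, and the text does not state which cross-validation scheme was applied. The function names `fit_line` and `loocv_r2_rmse` are illustrative only.

```python
from math import sqrt


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def loocv_r2_rmse(xs, ys):
    """Leave-one-out cross-validated R^2 and RMSE (assumed scheme).

    xs: hypothetical predictor value at each sampling site.
    ys: measured NO2 concentration at each sampling site.
    """
    xs, ys = list(xs), list(ys)
    n = len(ys)
    predictions = []
    for i in range(n):
        # Refit the model without site i, then predict site i.
        intercept, slope = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        predictions.append(intercept + slope * xs[i])
    sse = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    mean_y = sum(ys) / n
    sst = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - sse / sst, sqrt(sse / n)
```

A cross-validated R² close to the model R², together with a small RMSE relative to the measured range, would suggest that the model is not overfitted to the sampling sites.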

Models developed for the PIAMA cohort and for LISA and GINI children born in Munich were based on measurements collected between March 1999 and July 2000 [6]. The remaining LISA and GINI cities of Wesel and Leipzig were sampled in 2003 [26]. The model developed for CAPPS children born in Vancouver was based on measurements in the spring and fall of 2003 [9]; and the model developed for SAGE and CAPPS children born in Winnipeg was based on measurements in 2007 [8].

NO2 estimates for the BAMSE birth cohort were assigned to birth addresses using dispersion models [14, 27]; emission data for traffic-generated NOx were collected for the years 1990 and 2000. Pollutant dispersion was estimated using a dilution model based on wind speed, direction and precipitation [14]. Final models were validated using measurements taken outside the homes of 487 study children in the BAMSE cohort.

Ozone estimates were assigned to the European cohorts based on models developed in the APMoSPHERE project [28]. Predictions were made for the year 2001. In Canada, ozone estimates were assigned based on the average concentration among the three closest ambient monitors (within 50 km) using an inverse distance weighted approach.
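The Canadian ozone assignment can be sketched as follows. This is an illustration under stated assumptions rather than the cohort code: the weighting exponent (`power=1.0`), the great-circle distance calculation, the handling of a monitor at zero distance, and the names `haversine_km` and `idw_ozone` are all assumptions not given in the text.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * earth_radius_km * asin(sqrt(a))


def idw_ozone(home, monitors, k=3, max_km=50.0, power=1.0):
    """Inverse distance weighted mean of the k closest monitors within max_km.

    home: (lat, lon) of the birth address.
    monitors: list of (lat, lon, concentration) tuples.
    Returns None when no monitor lies within max_km.
    """
    nearby = []
    for lat, lon, concentration in monitors:
        distance = haversine_km(home[0], home[1], lat, lon)
        if distance <= max_km:
            nearby.append((distance, concentration))
    if not nearby:
        return None
    nearby.sort(key=lambda pair: pair[0])
    nearest = nearby[:k]
    # Assumption: a monitor located exactly at the address is used directly.
    if nearest[0][0] == 0.0:
        return nearest[0][1]
    weights = [1.0 / distance ** power for distance, _ in nearest]
    weighted_sum = sum(w * c for w, (_, c) in zip(weights, nearest))
    return weighted_sum / sum(weights)
```

Under this scheme, addresses with no monitor within 50 km receive no ozone estimate, and closer monitors contribute more to the assigned concentration.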

Data transfer and creating a common database

Primary (asthma, wheeze) and secondary (allergic rhinitis, eczema, sensitization) outcome variables were available for all cohorts along with several potential confounders. Data were collected at different time points across the cohorts (Fig. 1), and there were slight differences in questionnaire wording and case definitions (see Supplementary Material). New TAG variables were derived from data common to all cohorts.

Fig. 1 Follow-up time points for each cohort

For all cohorts, questions pertaining to physician-diagnosed asthma, allergic rhinitis and eczema were asked when the child was 8 years of age, with the exception of CAPPS (assessed at 7 years).

Asthma at any time during follow-up (‘ever’) and wheeze ‘ever’ variables were created using every available follow-up to the age of 8 years. If parents reported no asthma or wheeze at every follow-up and no more than one follow-up had missing data, children were coded as not having asthma/wheeze ‘ever’. Children in the referent group missing data for more than two follow-up periods were excluded.
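The derivation of the ‘ever’ variables can be sketched as below. The function name `code_ever` and the encoding of parental reports (`True`, `False`, `None` for missing) are hypothetical, and the missing-data tolerance is exposed as a parameter, `max_missing`, whose default of one follow-up is an assumption.

```python
def code_ever(reports, max_missing=1):
    """Derive an 'ever' outcome from repeated parental reports.

    reports: one entry per follow-up up to age 8; True = reported,
             False = not reported, None = missing.
    Returns 1 (ever), 0 (never, referent group) or None (excluded).
    """
    # A single positive report is enough to code the child as 'ever'.
    if any(report is True for report in reports):
        return 1
    n_missing = sum(1 for report in reports if report is None)
    # Referent group: no positive report and few enough missing follow-ups.
    if n_missing <= max_missing:
        return 0
    # Too much missing data to be confident the outcome never occurred.
    return None
```

For example, `code_ever([False, None, True])` returns 1, whereas a child with no positive reports and two missing follow-ups is excluded under the default tolerance.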

Sensitization was assessed by skin prick testing at age 7 for CAPPS and SAGE and by RAST at age 6 for GINI and LISA and at age 8 for BAMSE and PIAMA (defined as any specific IgE antibody value of 0.35 kU/L or greater). Results are presented for outdoor (birch, dactylis, timothy grass, mugwort, ragweed, rye, trees, and weeds) and indoor (alternaria, cats, cladosporium, dogs, feathers, house dust mites, molds, and cockroaches) allergens.
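For the cohorts assessed by RAST, the sensitization definition amounts to a threshold on the specific IgE results. The sketch below illustrates it; the function name `is_sensitized` and the dictionary layout are hypothetical, and skin prick test results (CAPPS and SAGE) are not covered.

```python
SENSITIZATION_CUTOFF = 0.35  # kU/L, threshold stated in the text


def is_sensitized(ige_by_allergen, cutoff=SENSITIZATION_CUTOFF):
    """Classify a child from specific IgE results.

    ige_by_allergen: dict mapping allergen name to specific IgE in kU/L,
                     with None for allergens that were not tested.
    Returns True or False, or None when no result is available.
    """
    values = [v for v in ige_by_allergen.values() if v is not None]
    if not values:
        return None
    # Sensitized if any allergen reaches the cutoff.
    return any(v >= cutoff for v in values)
```

Passing only the outdoor or only the indoor allergens gives the corresponding outdoor and indoor sensitization indicators.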

Results

There are a total of 15,134 children in the merged dataset (11,760 with complete follow-up; Table 1): 11,720 children have complete data on wheeze, 10,202 children have complete data on asthma and 10,743 children have assigned NO2. NO2 was the only traffic pollutant available for every cohort. In the SAGE, GINI and LISA studies, NO2 was available only for children living in the urban centers of Winnipeg (SAGE) and Munich (GINI/LISA).

GSTP1 rs1138272 was available for 40 % of the combined dataset (21–94 % coverage by cohort), GSTP1 rs1695 was available for 44 % (30–97 %) and TNF rs1800629 for 39 % (20–93 %). Additional SNPs of interest for allergic rhinitis and eczema were available for 11.4–38.9 % of the combined dataset and coverage within each cohort ranged from 1.9 to 94.9 % (Table 2).

Table 2 Numbers and proportion of children with air pollution, birthweight and genotyping data

The proportions of children with asthma, wheeze, allergic rhinitis, sensitization and eczema are shown in Table 3. The Canadian and Swedish studies had the highest incidence and prevalence of asthma while the German and Dutch cohorts had the lowest. This is due in part to study design, since CAPPS recruited only high-risk children and the data used for SAGE and BAMSE were from nested case–control studies. The proportion of children with physician-diagnosed asthma reported at 7/8 years was 6.3 % and ranged from 2.4 % in LISA to 31.4 % in SAGE. The proportion of children with a physician diagnosis of asthma ‘ever’ was 16.0 % and ranged from 6.1 % in LISA to 41.6 % in CAPPS. The proportion of children with ‘wheeze ever’ was 44.5 % and ranged from 37.2 % in GINI to 62.6 % in SAGE. Finally, the proportion of children with both physician-diagnosed asthma ever and wheeze at 6/7/8 years was 7.2 % and ranged from 3.7 % in PIAMA to 43.7 % in SAGE. Overall, 1,412 (14.1 %) children reported allergic rhinitis and 2,083 (30.5 %) were sensitized to at least one aeroallergen. Among those reporting a doctor diagnosis of allergic rhinitis with available information on sensitization, 63.3 % (655/1,035) were sensitized to at least one aeroallergen. A breakdown of important covariates by cohort is provided in Tables 3 and 4.

Table 3 Data on key variables, for pooled TAG data and by cohort (percentages are relative to the total children for given cohort within TAG)
Table 4 Summary statistics for birthweight and each pollutant, by study

NO2 distributions for Germany and The Netherlands were similar while those for Canada (SAGE) and Sweden indicate slightly lower mean concentrations (Table 4). For NO2 there was little overlap in concentration range between SAGE and the other cohorts. The within-cohort variation for NO2 in CAPPS and BAMSE was greater than the between-cohort variation (see Supplementary Material). In Canada, the within-cohort variation was due to the minimal overlap in the NO2 concentrations between the two centers of Vancouver (18.9–55.2 μg/m3) and Winnipeg (4.1–21.5 μg/m3).

Table 5 reports genotype frequencies for the pooled data and by cohort. For GSTP1 rs1138272, GSTP1 rs1695 and TNF rs1800629, heterozygous and minor alleles were more common in PIAMA and major alleles were more common in CAPPS. Allele frequencies for additional SNPs of interest for allergic rhinitis and eczema are also included in Table 5.

Table 5 Genotype frequencies by cohort

Discussion

TAG represents the first consortium to examine the interaction between candidate genes of oxidative stress and inflammation, and traffic-related air pollution in relation to incident childhood airway diseases. Our database provides an unprecedented opportunity for pooled analysis of a significantly larger sample than in previously published analyses. This also allows novel analyses examining the interaction between air pollution and genome-wide data, which have also been integrated into the TAG database.

Based on the literature, and availability of genotyping within each cohort, we obtained data on three SNPs postulated to modify the relationship between air pollution and asthma: rs1138272/1799811 (GSTP1), rs1695/947894 (GSTP1) and rs1800629 (TNF). Mutations in the glutathione S-transferase (GST) enzymes have been associated with asthma. The activity of GST in the lung is influenced by the GSTP1 enzyme [29] and this oxidative stress-modifying enzyme has been found to alter the response to air pollutants [30, 31]. Moreover, the GSTP1 rs1695 SNP may have a differential effect on the development of asthma according to age—an association has been found for early onset of disease but not for late onset [15]. TNF responds to inflammation markers and has been shown to modify the relationship between ozone and asthma [31].

NO2 is the pollutant with the most comprehensive coverage across the birth cohorts and is useful as a marker of within-city variability in exposure to traffic-related air pollutants. NO2 is a reasonable indicator of TRAP and has been a useful exposure marker in previous epidemiological investigations [11, 32, 33].

Traffic-related air pollution exposures were calculated as annual averages for the home address reported at birth, even though the measurements used to estimate TRAP were taken after birth for each of the cohorts. Recent findings [34–36] suggest that it is reasonable to apply a land use regression model from one time point to other time points up to 7 years into the past, because the spatial distribution of these pollutants is generally stable over time.

The availability of individual data from each cohort allows for pooled data analysis within TAG. There is adequate variability across birth cohorts and cities in air pollution distributions, asthma prevalence and SNP frequency to facilitate epidemiologic analyses. The higher prevalence of asthma within CAPPS (high-risk cohort), SAGE (nested asthma case–control) and BAMSE (nested wheeze case–control) provides additional power for pooled analysis [37], but the inclusion of cohorts with differing study designs warrants cautious interpretation of pooled estimates. For CAPPS and BAMSE, the within-cohort variation in NO2 is greater than the between-cohort variation, which supports the rationale for a pooled analysis rather than a meta-analysis by cohort.

The main strength of TAG is the increased study power gained by combining data from multiple cohorts. However, this merging of data also carries some inherent limitations. While outcome and exposure variables across the European cohorts are comparable [38], the nonstandard definitions used for some potential confounders may reduce the precision of our estimates. A small number of potential confounders could not be included in our pooled dataset (mode of delivery, breastfeeding, parity, gas stove, visible mold and pets in the home) because it was not possible to harmonize data across each cohort. Asthma is defined as parent report of physician diagnosis in the European cohorts but is defined by a physical exam with a pediatric allergist in the Canadian cohorts. All pooled analyses will be replicated within each cohort to assess agreement between effect estimates [37]. This is another important strength of TAG because the consistency of effects across different populations can be examined using standardized methods. Further, children excluded from the pooled analysis due to insufficient air pollution data may have been more likely to live in rural areas, particularly within the SAGE cohort, and restricting to those with school-age follow-up may also have resulted in selection bias.

These cohorts are unique in that they have highly detailed exposure assessment for TRAP and have recruited pregnant women or newborns and therefore have the ability to assess the development of asthma from birth. TAG comprises a rich database, the largest of its kind, for investigating the effect of genotype on the association between air pollution and childhood allergic disease.