Introduction

Tuberculosis (TB), caused by Mycobacterium tuberculosis (Mtb), remains a major public health threat globally, with a high burden in Sub-Saharan Africa. According to the World Health Organization, in 2011, Uganda’s TB incidence rate was 193 per 100 000 people, compared with 3.9 per 100 000 in the United States (http://www.who.int/tb/country/data/profiles/en/).

Exposure to Mtb initiates the first steps in the pathogenesis of Mtb infection and subsequent active TB. Tuberculin skin tests (TSTs) and interferon-γ (IFN-γ) release assays measure T-cell responses to Mtb and are utilized to identify Mtb-infected individuals. Infected individuals can remain healthy and without signs of active infection or disease (termed latent tuberculosis infection) or progress to active TB. Only about 10% of healthy adults with Mtb infection develop active TB. Notably, using the TST as a marker for Mtb infection, we have found that ~10% of individuals who are household contacts of patients with pulmonary TB remain uninfected for at least 2 years.1, 2 Our TB household contact study is unique in that it has rigorously characterized resistance to Mtb infection in the face of persistent exposure in both the household and TB-endemic community with a 2-year follow-up period.

Human genetic susceptibility is involved in the pathogenesis of TB, with most research focusing on immune response genes.3, 4 Previous research has shown that chromosomal regions linked to TB differed from those linked to resistance to Mtb infection.2 In this study, we examined this hypothesis further, by contrasting the results of two analyses: (1) the presence versus absence of active TB, and (2) resistance versus susceptibility to Mtb infection. Mtb-uninfected individuals are characterized by a persistently negative TST over an extended period of exposure and are referred to as resistors (RSTRs). Our previous work has shown that these persistently TST-negative individuals have equivalent epidemiologic risk profiles to those who have positive TSTs, including exposure to the index TB case and clinical characteristics.5 In that study, we found the primary predictor of RSTR was young age, and we hypothesized that host factors, such as genetics and innate immunity, likely also influenced the RSTR phenotype.

Numerous studies have informed our understanding of the role of host genetics in susceptibility to Mtb infection and disease. There are several classes of genes that are important for host responses to TB.6, 7 These include the Toll-like and Nod-like receptor families of genes (TLR1, TLR2, TLR4, TLR6, TLR9, TIRAP, TOLLIP, TICAM1/2, MyD88, NOD1, NOD2), cytokines and their receptors expressed by macrophages (TNF, TNFR1/2, IL1α/β, IL4, IL6, IL10, IL18, IL12A/B, IL12RB1/2, IFNG, IFNGR1/R2), genes expressed by T-cells (IFNG, IL4, IL12, STAT1, IL12RB1/2, IL10) and key TB candidate genes (SLC11A1, SLC6A3). Many genes in these pathways have been studied extensively in animal, macrophage and human studies and have shown varying degrees of association with TB, whereas others have not received much attention.3, 4, 6, 7

Typically, studies exploring TB and genetic risk factors for disease have focused on a few polymorphisms within a few candidate genes. It is critical for the field to examine genetic influences on the development of TB broadly, validate other genetic findings and avoid single candidate gene studies unless accompanied by validation and/or biology.8 In our current study, we have taken a comprehensive approach to the examination of genetic susceptibility to TB by investigating haplotype-tagging single-nucleotide polymorphisms (SNPs) in multiple candidate genes involved in innate and/or adaptive immune pathways that affect host responses to mycobacterial invasion. The objective of our current study was to examine the association between these candidate genes and pulmonary TB and RSTR phenotypes within the context of a TB household contact cohort. Finally, our inclusion of household contacts of all ages and regardless of HIV status allowed us to explore the hypothesis that pediatric TB is different from adult TB in its genetic risk profile9, 10, 11 and to explore the impact of HIV infection on the TB genetic risk profile. The field of pediatric TB has been neglected, and this study provides a unique opportunity to examine the effects specific to children.

Results

Genetic association with TB

We first examined whether 546 haplotype-tagging SNPs in 29 immune pathway genes were associated with TB in 835 subjects from 481 families within 298 households (Table 1). Two hundred and forty individuals (28.7%) had TB (43% of the pediatric TB cases were culture positive, data not shown). The mean age was 18.43 (median=17) years, and 15% were HIV+. The percentage of HIV+ individuals within each group was similar, with 15% HIV+ in the TB analysis and 13% HIV+ in the RSTR analysis (data not shown).

Table 1 Sample characteristics

Genetic association analysis with pulmonary TB as the outcome of interest showed that two SNPs met the study-wide significance threshold, with 19 additional SNPs showing a nominally significant association (P<0.05; Table 2). The top SNPs in the TB analysis included one SNP within TICAM2 (aka TRAM) in the 5′ region, rs746566 (odds ratio (OR)=1.42, P=3.6 × 10⁻⁶) and one SNP in IL1B, rs1143643 (OR=1.99, P=4.3 × 10⁻⁵). Multiple SNPs were associated with TB at the nominal P<0.05 level in IL4 (best P=6.9 × 10⁻³), NOD1 (P=9.4 × 10⁻³) and TOLLIP (P=6.8 × 10⁻³). Allele frequencies in cases and unaffected individuals for SNPs significant at nominal P<0.05 are provided in Supplementary Table S1, and results for all SNPs in TICAM2 and NOD1 are provided in Supplementary Table S2. To assess the impact of phenotype definition (both TST+ and RSTRs within the ‘control’ group), we conducted a sensitivity analysis, restricting the controls to only TST+ individuals. The trend in results remained the same, albeit with reduced significance because of the reduced sample size (data not shown).

Table 2 Results of genetic association analysis of TB phenotype (SNPs with nominal P-values <0.05)

Although the association with IL1B has been reported in the literature before,12, 13 the associations with TICAM2 and NOD1 have not, so we sought to replicate these findings in an independent cohort. We obtained the Wellcome Trust Case Control Consortium (WTCCC) TB genome-wide association study data14 and examined SNPs in TICAM2 and NOD1 (Supplementary Table S3); this population (Gambia) is the same one that previously showed an association with IL1B.12 Among the 42 SNPs in/near TICAM2 that passed quality control (QC), five showed significant association with TB with uncorrected P-value <0.05. The most significant SNP was rs1005551 with P=0.024 with adjustment for sex and tribe, which meets the threshold for independent replication.15 Among the 23 SNPs in/near NOD1 that passed QC, four were associated with TB with P-value <0.05 (Supplementary Table S4), with the most significant SNP being rs42603 with P=0.00096 adjusting for sex and tribe, also meeting the threshold for independent replication.

Examination of age-specific effects with TB

To assess whether genetic determinants of infection and disease were age dependent, we used a genotype–age interaction analysis. Our primary focus here was on the interaction term of the model, as main effects cannot be interpreted independently in models with interaction terms. Six genes showed an association with TB in children but not in adults (Table 3). The interaction term for rs2043055 (IL18 intron) attained suggestive significance (P=2.9 × 10⁻³), about one order of magnitude above the threshold for study-wide significance (P=2 × 10⁻⁴), and two additional SNPs approached this same level of significance. Association with IL18 was not observed in the sample as a whole (Table 2). In addition, three SNPs within TLR6 were suggestively associated with pediatric TB at this same level, with the most significant result at TLR6 3′ SNP rs5743832 (P=2.7 × 10⁻³). One SNP within IL1A, one within IL1B, five within STAT1, three within TLR6, two within IL12B, one within TLR4 and four SNPs within IL18 were nominally (uncorrected P<0.05) associated with pediatric TB.

Table 3 Genotype × age interaction analysis of TB

Genetic association with RSTR

We next examined whether the same set of SNPs was associated with RSTR in 718 individuals, including 75 individuals (10.5%) who were RSTR. None of the SNPs met the experiment-wide significance level in the analysis with RSTR as the phenotype (Table 4). However, 17 SNPs showed a nominal association at the P<0.05 level. The top SNPs in this analysis included two SNPs in NOD1, two SNPs in NOD2 and three SNPs in SLC6A3. STAT1 was associated with RSTR in the sample as a whole, though it was associated with TB in the pediatric sample (Table 3). To make sure that HIV seropositivity did not influence the results (for example, anergy resulting in negative TSTs), we conducted a sensitivity analysis, excluding the HIV+ individuals from this analysis, and found no difference in significance (data not shown). In the age × genotype analysis for RSTR (Table 5), several SNPs in both IL12RB1 and IL12RB2 had significant interaction effects (P<0.01). These SNPs were associated with increased odds of RSTR in adults versus decreased odds of RSTR in children, or vice versa. Generally, these effects were significant in only one of the two age groups (adults or children), not both.

Table 4 Results of genetic association analysis of RSTR phenotype (SNPs with nominal P-values <0.05)
Table 5 Genotype × age interaction analysis of RSTR

Discussion

Our study examined the association between 29 candidate genes involved in innate immune responses and two distinct phenotypes that result as a consequence of Mtb exposure: resistance to infection and pulmonary TB. We identified novel associations between pulmonary TB and TICAM2; to our knowledge, we are the first to observe associations between this gene and TB, and we replicated this finding in an independent data set. Moreover, we observed several SNPs with P<10⁻² in NOD1 that were associated with TB. Although our results for NOD1 did not achieve significance after multiple testing correction, this is the first report of an association between TB and NOD1, which we also replicated in an independent cohort. In addition, we observed novel suggestively significant interactions between SNPs in IL18 and TLR6 and age; these SNPs were associated with TB in children aged ⩽10 years. Finally, we observed two SNPs in TOLLIP associated with TB (P<0.05), consistent with earlier findings.16

Three SNPs within the TICAM2 gene were associated with TB, with one SNP significant at the experiment-wide threshold. In addition, one TICAM2 SNP was nominally associated with RSTR. TICAM2, also known as TRAM, is a Toll-like receptor (TLR) adaptor that supports TLR4-mediated immune responses.17 In a recent study, TICAM2 levels predicted with 80% accuracy whether subjects would be high or low responders to the MVA85A TB vaccine candidate.18 Ours is the first study to find an association between TICAM2 genetic variants and TB. In addition, we replicated association with TICAM2 SNPs (P<0.05) in the WTCCC data.14 Our most significant SNP did not replicate, which may be due to population genetic differences such as LD patterns and/or differences in ascertainment of cases and controls, as well as the design of the genotyping arrays (see Supplementary Material for detail);8 a nearby TICAM2 SNP, rs17473484, which is ~7 kb away, showed P=0.034, and another, rs10055514, ~51.5 kb away, showed P=0.039.

We observed a statistically significant association between TB and IL1B, more significant than in previous reports and in intronic rather than exonic variants.12, 13 Intronic SNPs in IL4 were also associated with TB. This is the first report of an association of IL4 polymorphisms with TB in an African population and replicates studies of IL4 in TB in non-Africans.19, 20 Our greater SNP density and use of haplotype-tagging SNPs allowed us to detect these genetic association effects.8, 21 This greater coverage of genetic variation may explain why we achieved greater significance than in previous reports.12, 13

We investigated children aged ⩽10 years based on reports of age-specific genetic effects for TB,9, 10 differences in immune responses of children compared with adults22 and unique epidemiological risk profiles for Mtb infection in children.5 We found an association between TB and IL18 and TLR6 in children and suggestive associations between TLR4 and IL12B and pediatric TB. As most TB genetics studies focus on adults, this may explain why associations between TB and IL18 have rarely been reported before. Interleukin 18 (IL18), similar to IL1β, is a pro-inflammatory cytokine that requires activation of the host cell inflammasome for secretion in its mature, bioactive form.23 Mature IL18 has a role in development of T helper type-1 immune responses and, with IL12, regulates IFN-γ production by T cells and natural killer cells.24 Although IFN-γ and IL1β are considered essential for the control of Mtb, the role of IL18 in immune responses to Mtb remains unclear. Some murine models have demonstrated a protective role for this cytokine following in vivo Mtb infection,25 and human in vitro studies suggest that IL18 synergizes with IL12 to provide optimal control of Mtb in human macrophages.26 The only previously reported association between IL18 and TB came from a meta-analysis of Chinese studies.27

The association between genetic variation in TLR6 and TB has been investigated in a few prior reports. A meta-analysis of four study populations (three ethnically diverse populations in the United States and an Indian population) showed modest association between a TLR6 polymorphism and TB, though these populations were presumably all adults.28 In young infants, TLR6 polymorphisms have also been associated with altered BCG-specific cytokine responses,29 particularly post-BCG vaccination.30 The causal SNP implicated by Randhawa et al.,30 rs5733810, is in moderate linkage disequilibrium (LD) with rs5743812 in Kenyan HapMap data. We observed association between rs5743812 and pediatric TB but did not genotype those two SNPs, so we cannot examine LD in the Ugandan population. Furthermore, we did not observe association with TLR1, which is in strong LD with TLR6 in certain populations;31 given the lower LD seen in the Ugandan population32 and non-significant association with TLR1, these effects are likely due to TLR6 alone. Previously, we have detected signatures of natural selection in TLR6 in Ugandans,32 suggesting that this gene may be important in infectious disease susceptibility. Regarding the contribution of TLR6 to innate control of Mtb infection, there has been one report demonstrating that recognition of Mtb by TLR2/TLR6 heterodimers contributes to activation of the host cell inflammasome, caspase-1 activation and subsequent production of mature IL1β.33 As children aged ⩽10 years are more likely than adults to experience their first exposure to Mtb (in TB-endemic settings), genetic susceptibility to TB may differ depending on whether the host has preexisting cumulative immune sensitization to Mtb. Given the borderline P-values of some of our findings, our conclusion that they reflect unique age-based genetic susceptibility to TB may be premature. Our findings emphasize the importance of including children in genetic susceptibility studies, especially for diseases such as TB where disease risk and phenotype change as children grow older and their immune systems mature.

Though not significant at the experiment-wide threshold, SNPs in NOD1 and NOD2 were associated with TB and the RSTR phenotype, respectively. One study in a Chinese population identified a single SNP in the NOD2 gene associated with TB susceptibility,34 whereas we observed an association between this gene and RSTR. NOD2, a cytosolic pattern recognition receptor, has been implicated in recognition of Mtb products that are secreted from the macrophage phagosome into the cytosol. Thus NOD2 may have a role in activation of the host cell inflammasome with subsequent production of mature IL1β and IL18.33, 35, 36 Ours is the first study to report associations between NOD1 and TB, and we have replicated this finding in the WTCCC study data. Even though the NOD1 SNPs did not achieve experiment-wide statistical significance, the finding is noteworthy, because this is the first report of a possible role for NOD1, and no other studies have examined genetic influences on RSTR.

Although many studies designed to uncover genetic associations with TB focus on active disease, few have explored the genetic association or genetic linkage with the TST− phenotype.2, 37 As most studies do not include TST in the characterization of non-diseased individuals,8 there is usually no assessment of the unaffected subject’s exposure and/or infection with Mtb. Our use of data from a longitudinal household contact study not only provides an opportunity to collect follow-up data but also confirms Mtb household exposure of all study participants.38 The RSTR phenotype is of special interest as these individuals do not appear to become infected by Mtb over a 2-year period, despite heavy exposure to an individual with active pulmonary TB and residence in a high TB-endemic area.5 Though we did not find any SNPs to be significantly associated with the RSTR phenotype at the P<2 × 10⁻⁴ (study-wide α=0.05) level, we did find a nominally significant association with three SLC6A3 SNPs. This finding replicates the cross-sectional study by Cobat et al.,37 conducted in South Africa, that associated SLC6A3 with TST reactivity. Because we observed nominal associations between various genes and TB and not with RSTR, this further suggests that these distinct clinical outcomes are regulated by different genetic mechanisms. It is possible that we did not detect significant genetic associations with the RSTR phenotype, because the vast majority of RSTRs were young children, and the age-specific models may have been underpowered to detect an effect. Larger cohorts will be needed to more closely examine this trait. Finally, the impact of HIV on the characterization of RSTR is not well known. TST positivity is defined using a lower threshold for HIV-positive individuals, and in our previous work, we saw no difference in the distribution of HIV in RSTRs versus non-RSTRs.5 Because most of these study subjects were enrolled before CD4 counts were done in HIV-positive individuals (before 2004), we are unable to evaluate the impact of low CD4 and potential anergy in the RSTRs. Only four of the RSTRs were HIV+, so possible anergy likely had little influence on our findings.

Interestingly, we observed only one SNP within the 3′ region of the SLC11A1 gene (aka NRAMP1) that was associated with TB, and it did not achieve experiment-wide statistical significance (P=0.026). SLC11A1 has been associated with TB in meta-analyses,39, 40, 41 so the lack of statistically significant associations might be surprising. Non-replication could be due to study design, including differences in diagnostic criteria for TB cases and controls and issues of targeted polymorphisms versus comprehensive LD coverage.8, 15 Another possible explanation for our weak association between TB and SLC11A1 is interaction between SLC11A1 and other genes; in one report, TLR2 acted as a modifier of SLC11A1-associated TB risk.42

Our findings are limited by our sample size and the fact that we had no Ugandan replication sample. Despite these limitations, we identified significant and novel associations between SNPs in immune response genes and TB, such as TICAM2, NOD1 and IL1B, as well as pediatric TB-specific effects for IL18 and TLR6. Our findings warrant further study with a larger sample size. Our candidate gene, hypothesis-based approach, as opposed to a genome-wide analysis, may have prevented us from observing additional genes significantly associated with the RSTR phenotype, so further work is needed. Our age-based analysis suggests that genetic susceptibility to TB in adults and preadolescent children may differ, which warrants further investigation in a larger cohort of Mtb-exposed children.

Materials and methods

Study participants

Data used in this analysis were gathered from two phases of a household contact study conducted in Kampala, Uganda. Subjects from the Household Contact Study were enrolled from 1995 to 1999,43 while subjects from the Kawempe Community Health Study were enrolled from 2002 to 2008.38 The study protocol was reviewed and approved by the National HIV/AIDS Research Committee, The Uganda National Council of Science and Technology and the institutional review board at the University Hospitals Case Medical Center, Cleveland, OH, USA. Individuals who presented at the study clinic with active culture-positive pulmonary TB were enrolled as index cases. All household members who provided informed consent were also enrolled and evaluated at study entry with TST, HIV testing, chest X-ray and a history and physical exam for signs and symptoms of TB. Healthy household contacts underwent a follow-up evaluation every 3 months for the first 6 months, then every 6 months thereafter. Diagnosis of TB for this analysis was based on isolation of Mtb from clinical samples (sputum or gastric aspirates) of all adult patients and many of the pediatric cases (44% of those in this analysis)44 at any time during the study period. There were no individuals with disseminated TB (TB meningitis or miliary TB) included in this analysis. RSTR individuals were defined as having TSTs that remained negative throughout the 2-year follow-up period. A positive TST was defined by induration at the injection site >5 mm for children aged ⩽5 years or HIV-infected individuals and >10 mm for all others; the 10-mm cutoff is used in settings where BCG vaccine coverage is high.5, 45
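The TST and RSTR definitions above can be expressed as a small decision rule. The following is a minimal sketch, not the study's actual code: the function names are hypothetical, age is assumed fixed at enrollment for simplicity, and the strict inequalities follow the wording of the text.

```python
from typing import Sequence


def tst_cutoff_mm(age_years: float, hiv_positive: bool) -> float:
    """Induration cutoff: 5 mm for children aged <=5 years or HIV-infected
    individuals, 10 mm for everyone else (as stated in the text)."""
    return 5.0 if (age_years <= 5 or hiv_positive) else 10.0


def is_tst_positive(induration_mm: float, age_years: float, hiv_positive: bool) -> bool:
    """A TST is positive when induration exceeds the applicable cutoff.
    The text states the rule with a strict '>' and that is kept here."""
    return induration_mm > tst_cutoff_mm(age_years, hiv_positive)


def is_rstr(
    indurations_mm: Sequence[float],
    age_years: float,
    hiv_positive: bool,
    has_active_tb: bool = False,
) -> bool:
    """RSTR: every TST during follow-up is negative and there is no active TB.
    Assumption: an individual with no recorded TSTs cannot be classified as RSTR."""
    if has_active_tb or not indurations_mm:
        return False
    return not any(
        is_tst_positive(mm, age_years, hiv_positive) for mm in indurations_mm
    )


# Example: an HIV-negative adult whose readings stay below 10 mm is an RSTR.
print(is_rstr([0.0, 2.0, 4.0], age_years=30, hiv_positive=False))
```

A single positive reading at any visit is enough to exclude an individual from the RSTR group under this rule, which mirrors the requirement that TSTs remain negative throughout follow-up.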

Genotyping

In our analysis, we focused on 29 genes involved in the tumor necrosis factor, IL, TLR/NLR and IFNG/IL12 pathways, genotyping 546 haplotype-tagging SNPs within these genes. Tag SNPs were selected to capture common genetic variation (minor allele frequency ⩾5%) with strong coverage (LD r²⩾0.8) in any of the three African HapMap populations, based on our previous work,32 and were identified using the Genome Variation Server (http://gvs.gs.washington.edu/GVS137/index.jsp). Genotyping was conducted using the Illumina iSelect platform (San Diego, CA, USA). Once SNPs were selected using the Genome Variation Server, their availability on the iSelect platform was verified; if a specific SNP was not available on iSelect, a nearby SNP was selected to replace it. Genotype calling and QC were performed using Genome Studio, filtering the SNPs by call frequency, replicate errors and clustering quality (AB R Mean, AB T Mean); 14 SNPs were removed in this process. Self-reported family relationships were confirmed using genetic data and corrected where needed.
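The two tag-SNP criteria above (minor allele frequency of at least 5% and LD r² of at least 0.8) can be illustrated with a short sketch. This is not the Genome Variation Server algorithm: the function names are hypothetical, and r² is approximated here as the squared Pearson correlation between unphased genotype dosages (0, 1 or 2 copies of an allele), which is a common simplification.

```python
from typing import Sequence


def minor_allele_frequency(dosages: Sequence[int]) -> float:
    """Frequency of the rarer allele, given per-person allele counts (0/1/2)."""
    freq = sum(dosages) / (2 * len(dosages))
    return min(freq, 1.0 - freq)


def genotype_r2(x: Sequence[int], y: Sequence[int]) -> float:
    """Squared Pearson correlation between two dosage vectors.
    Returns 0.0 when either SNP is monomorphic in the sample."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0
    return (sxy * sxy) / (sxx * syy)


def is_captured(
    snp: Sequence[int],
    tag: Sequence[int],
    maf_min: float = 0.05,
    r2_min: float = 0.8,
) -> bool:
    """True when a common SNP (MAF >= maf_min) is covered by a tag SNP (r2 >= r2_min)."""
    return minor_allele_frequency(snp) >= maf_min and genotype_r2(snp, tag) >= r2_min


# Example: two SNPs with identical dosages are in perfect LD.
snp_a = [0, 1, 2, 1, 0, 2]
print(is_captured(snp_a, snp_a))
```

Dedicated tools estimate r² from phased haplotypes in reference panels, so values from this dosage-based approximation may differ somewhat from those used for the actual selection.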

Statistical analysis

Sample allele frequencies were calculated adjusting for family structure by means of the maximum-likelihood approach implemented in FREQ, part of the S.A.G.E. package.46 Genetic association analyses were conducted by logistic regression using generalized estimating equations (GEE) to account for genetic relatedness within households, as implemented in the R package gee. Observations were clustered by subfamily, defined as groups of first-degree relatives living within a household. Genetic association analyses were conducted separately to examine two distinct phenotypes: active TB (versus absence of active TB) and RSTR (versus susceptibility to Mtb infection); TST+ individuals without active disease were included in the control group for both analyses, and RSTRs did not have active TB by definition. Each subject had only one clinical classification (RSTR, TST+ or TB). Genotypes were coded under both additive and dominant genetic models, using the minor allele as the effect (‘risk’) allele. Recessive models were not tested, because the rare allele homozygote was usually too infrequent for the models to be reliable. Sex and HIV status were included as covariates in all the analyses. An exchangeable correlation matrix was used in the GEE model, except where the minor allele was too rare for the exchangeable model to converge to a maximum, in which case an independence model was fitted. A single-SNP P-value of 2 × 10⁻⁴, corresponding to a study-wide significance threshold of α=0.05, was determined by estimating the number of independent tests based on LD among the SNPs passing QC47 using the program SNPSpDlite (http://gump.qimr.edu.au/general/daleN/SNPSpDlite/).
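The genotype coding and the significance threshold described above can be sketched as follows. The function names are hypothetical, and the threshold arithmetic assumes a Bonferroni-style division of α by the effective number of independent tests; the text reports only the resulting threshold, not the SNPSpDlite estimate itself.

```python
from typing import Tuple


def minor_allele_count(genotype: str, minor_allele: str) -> int:
    """Count copies of the minor allele in a two-character genotype such as 'AG'."""
    if len(genotype) != 2:
        raise ValueError("expected a biallelic genotype such as 'AG'")
    return sum(1 for allele in genotype if allele == minor_allele)


def code_genotype(genotype: str, minor_allele: str) -> Tuple[int, int]:
    """Return (additive, dominant) codes with the minor allele as the effect allele.
    Additive: 0, 1 or 2 copies. Dominant: 1 for carriers, 0 for non-carriers."""
    count = minor_allele_count(genotype, minor_allele)
    additive = count
    dominant = 1 if count >= 1 else 0
    return additive, dominant


def per_snp_threshold(alpha: float, n_effective_tests: float) -> float:
    """Bonferroni-style per-SNP threshold (assumed form of the correction)."""
    return alpha / n_effective_tests


def implied_effective_tests(alpha: float, per_snp_p: float) -> float:
    """Effective number of independent tests implied by a reported threshold."""
    return alpha / per_snp_p


# Example: heterozygote coding, and the test count implied by P = 2e-4 at alpha = 0.05.
print(code_genotype("AG", "G"))
print(round(implied_effective_tests(0.05, 2e-4)))
```

Under this assumption, the reported threshold would correspond to roughly 250 effective independent tests, fewer than the number of SNPs genotyped, as expected when SNPs within a gene are correlated through LD.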

We also conducted an analysis including an age × genotype interaction term to explore age-specific genetic effects, where age was a binary variable (⩽10 years versus >10 years). This age cutoff was based on similarity of epidemiological risk factor distribution within children ⩽10 years of age compared with older children and adults.5 When the interaction term was significant, we conducted stratified analyses (separate models for age ⩽10 and age >10 years) to evaluate whether the significant genetic effect was in the children, adults or both. Similarly, we conducted an HIV × genotype analysis, based on our earlier observation that HIV seropositivity may have a synergistic genetic effect on TB risk;48 these analyses were restricted to the TB phenotype, because there were too few HIV-infected individuals who were RSTR. Results did not attain statistical significance in the HIV–genotype interaction models (Supplementary Table S5).
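One way to write the interaction model described above is given below. The notation is ours and is only a sketch: i indexes subfamily clusters, j indexes individuals, G is the coded genotype, A equals 1 for age ⩽10 years and 0 otherwise, and the inclusion of an age main effect is an assumption.

```latex
\operatorname{logit}\,\Pr(Y_{ij}=1)
  = \beta_0 + \beta_G\,G_{ij} + \beta_A\,A_{ij}
  + \beta_{GA}\,G_{ij}A_{ij}
  + \beta_S\,\mathrm{sex}_{ij} + \beta_H\,\mathrm{HIV}_{ij}

% Per-allele (or carrier) odds ratios implied by the model:
\mathrm{OR}_{A=0} = e^{\beta_G}, \qquad
\mathrm{OR}_{A=1} = e^{\beta_G + \beta_{GA}}
```

In this form, β_G is the genotype effect only in the reference age group (older children and adults), which is why main effects cannot be read on their own; the test of β_GA = 0 is the test of whether the genotype effect differs between the two age groups.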