Introduction

Leprosy is an infectious disease caused by Mycobacterium leprae that, in 2012, affected ~230,000 new individuals worldwide (WHO 2014). The disease mainly compromises the skin and peripheral nerves (Britton and Lockwood 2004) and can lead to severe disabilities. M. leprae is well adapted to the human host and exhibits very low variability across different isolates (Monot et al. 2009). Moreover, comparative molecular analysis of DNA samples recovered from preserved corpses from endemic European countries in the 12th and 13th centuries demonstrated that the genomic architecture and variability of M. leprae have not changed significantly over the past 1,000 years (Schuenemann et al. 2013). M. leprae is not highly infectious: only a small proportion of exposed individuals become infected and, among these, even fewer progress toward clinical disease. Within this subset of patients, leprosy may present as either a localized or a disseminated disease. Today, it is well accepted that human genetic variability in genes involved in the regulation of host immunity is crucial in determining both susceptibility and progression toward clinical forms of leprosy (Alter et al. 2011). In fact, genome-wide linkage scans (Alcais et al. 2007; Mira et al. 2003; Siddiqui et al. 2001), case–control studies of candidate genes (Cardoso et al. 2011a; Pereira et al. 2009) and genome-wide association studies (GWAS) (Wong et al. 2010; Zhang et al. 2009) have contributed to a growing list of genes associated with leprosy. Validation and replication studies in different populations, although not common, are mandatory to pinpoint the major genes/pathways controlling leprosy phenotypes, ultimately leading to an improved understanding of the influence of host genetic variation on susceptibility or resistance to the disease (Cardoso et al. 2011b).

The first leprosy GWAS was conducted in a Chinese population sample: variants located at CCDC122-LACC1 (the latter formerly known as C13orf31), NOD2, TNFSF15, HLA-DR-DQ and RIPK2 were associated with the disease, and a trend toward association was observed for LRRK2 (Zhang et al. 2009). A subsequent study using a Mali and a New Delhi population sample validated the CCDC122 and LACC1 associations (Wong et al. 2010). A family-based validation study conducted in Vietnamese families re-tested all 16 SNPs associated with leprosy in the original Chinese GWAS; 6 of them, located at the CCDC122-LACC1, NOD2, RIPK2 and HLA-DR-DQ loci, were replicated (Grant et al. 2012). The NOD2 gene was also associated with leprosy per se and leprosy reactions when tested in Nepal (Berrington et al. 2010).

Here we investigated whether non-HLA genes originally described in the Chinese GWAS are associated with leprosy among Brazilians. Our stepwise design, involving five population samples from different Brazilian regions, identified a positive association between leprosy and two genetic markers located at the NOD2 and CCDC122-LACC1 loci.

Methods

Ethics statement

All methods and procedures used in this study were approved by the local ethics boards and the Brazilian National Board for Ethics in Research. Written informed consent was obtained from all study participants.

Subjects and study design

First, we investigated all four candidate loci (five non-HLA genes identified previously in the Chinese GWAS: CCDC122-LACC1, NOD2, TNFSF15 and RIPK2) in a family-based sample recruited at the Prata Village, a former leprosy colony located in the state of Pará, northern Brazil. This village was founded in the early 1920s with the objective of isolating individuals affected by leprosy. Isolation was compulsory until 1962; however, to date, the population remains highly isolated and presents unique characteristics, such as very high disease frequency and homogeneous distribution of socioeconomic and environmental variables (Lazaro et al. 2010; Werneck et al. 2011). A very strong genetic effect controlling susceptibility to leprosy has been described for the Prata population (Lazaro et al. 2010), making it suitable for genetic association studies on leprosy. The Prata sample is composed of 179 individuals distributed in 60 nuclear families, from which 67 trios (one leprosy-affected individual and both parents) were derived.

Then, we used a stepwise strategy to investigate the associated markers from Prata in four replication samples from Brazil, totaling 3,435 individuals: three case–control samples, including 1,601 leprosy cases and 1,387 controls, from Rio de Janeiro–Rio de Janeiro, Bauru–São Paulo and Rondonópolis–Mato Grosso, and an independent family-based sample from Almenara–Minas Gerais, composed of 447 individuals distributed in 125 nuclear families from which 147 trios were derived. When necessary, siblings were used to infer the genotype of an absent parent.

Patients were classified according to the classic, five-group classification system (Ridley and Jopling 1966), and were treated following the World Health Organization recommendation, as paucibacillary or multibacillary. In Rio de Janeiro and Bauru, blood donors were used as controls; in Rondonópolis, controls were recruited during campaigns of active search for new leprosy cases performed at military bases and universities. In all contexts, controls were unrelated and from the same geographical region as cases, and presented no documented history of chronic infectious or inflammatory diseases. The ethnicity of each subject was classified as Black, Caucasian or Mestizo according to morphological characteristics of the individual and his/her family. The description of demographic and clinical characteristics of these samples is summarized in Table 1 and described in detail elsewhere (Marques et al. 2013).

Table 1 Demographic and clinical characteristics of the individuals included in the family-based and case–control association studies

SNP selection and genotyping

Tag SNP markers capturing the full information of each candidate gene from the Chinese GWAS were defined according to the information available at the International HapMap Project using the following parameters: minor allele frequency cutoff of 0.05 in the YRI population (Yoruba in Ibadan, Nigeria), the Tagger multimarker method, and an r² cutoff of 0.8. Following this strategy, 36 markers were interrogated at the four loci: 13 at CCDC122-LACC1, 7 at NOD2, 8 at RIPK2 and 8 at TNFSF15.
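To make the role of the r² cutoff concrete, the following is a minimal sketch of greedy pairwise tagging in Python. It is not the Tagger multimarker algorithm used in the study; it is a simplified stand-in that assumes pairwise r² values are already known. The function name, SNP labels and r² values are hypothetical.

```python
def greedy_tag_snps(snps, r2, threshold=0.8):
    """Greedy pairwise tagging (simplified illustration, not Tagger itself).

    snps: list of SNP identifiers.
    r2: dict mapping frozenset({snp_a, snp_b}) to pairwise r^2;
        pairs that are missing are treated as r^2 = 0.
    Returns a list of tag SNPs such that every SNP is either a tag or has
    r^2 >= threshold with at least one tag.
    """
    def captures(a, b):
        return a == b or r2.get(frozenset((a, b)), 0.0) >= threshold

    untagged = set(snps)
    tags = []
    while untagged:
        # Pick the SNP capturing the most still-untagged SNPs;
        # ties are broken alphabetically so the result is deterministic.
        best = min(
            snps,
            key=lambda s: (-sum(captures(s, u) for u in untagged), s),
        )
        tags.append(best)
        untagged = {u for u in untagged if not captures(best, u)}
    return tags


# Hypothetical SNP labels and r^2 values, for illustration only.
example_snps = ["snpA", "snpB", "snpC", "snpD"]
example_r2 = {
    frozenset(("snpA", "snpB")): 0.93,
    frozenset(("snpB", "snpC")): 0.85,
    frozenset(("snpA", "snpC")): 0.59,
    frozenset(("snpC", "snpD")): 0.36,
}
print(greedy_tag_snps(example_snps, example_r2, threshold=0.8))
```

In this toy example, snpB captures snpA and snpC at the 0.8 cutoff, while snpD is not captured by any other marker and therefore has to be genotyped itself.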

Genomic DNA was extracted from peripheral blood by classic salting-out (John et al. 1991). Genotyping was performed by fluorescence-based allelic discrimination using TaqMan, as implemented in the Applied Biosystems StepOnePlus Real-Time PCR System platform.

Statistical analysis

Family-based association analysis was performed using the Transmission Disequilibrium Test (TDT), as implemented in the FBAT software, version 2.0.2 (Horvath et al. 2001). We applied the empirical variance (-e) option to allow for association testing in the presence of linkage, an appropriate approach when multiplex families are used (Lake et al. 2000). Tests for deviation from Hardy–Weinberg equilibrium and linkage disequilibrium (LD) estimation (Prata Village) were performed using the Haploview software, version 4.2 (Barrett et al. 2005). To test for independence of positive association signals in the Prata sample, stepwise multivariate logistic regression analysis (Schaid and Rowland 1998) was performed as implemented in the SAS software, version 9.1.
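For readers unfamiliar with the TDT, the sketch below computes the classical statistic from counts of transmissions by heterozygous parents, using only the Python standard library. It assumes independent trios and does not reproduce the empirical-variance correction applied in FBAT. The function name and the counts are hypothetical.

```python
import math


def tdt(transmitted, untransmitted):
    """Classical TDT for one allele.

    transmitted / untransmitted: number of times heterozygous parents did /
    did not pass the allele to an affected child.
    Returns (chi-square with 1 degree of freedom, two-sided P value).
    """
    n = transmitted + untransmitted
    if n == 0:
        raise ValueError("no informative transmissions")
    chi2 = (transmitted - untransmitted) ** 2 / n
    # Survival function of a chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: the allele is transmitted 10 times and not
# transmitted 30 times, i.e. under-transmitted to affected offspring.
chi2, p_value = tdt(10, 30)
print(f"chi2 = {chi2:.2f}, P = {p_value:.4f}")
```

An allele transmitted less often than expected under the null (50% of informative transmissions) is interpreted as protective, which is the pattern reported for the associated alleles in this study.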

Comparative analyses of allelic, genotypic and carrier frequencies among cases and controls were carried out using an unconditional logistic regression model as previously described (Cardoso et al. 2011a; Marques et al. 2013). Analysis was performed using R for Windows, version 2.10.1 (R Development Core Team 2013), with the package “genetics”. An overall analysis combining the case–control samples was performed, adjusting for the possible confounding effects of geographic region of the population sample, gender and ethnicity. In addition, we integrated our TDT and case–control studies to obtain an overall odds ratio (OR) estimate, as suggested by Kazeem and Farrall (2005), using the package “catmap” in the R environment.
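One common way to combine case–control and TDT evidence is fixed-effects inverse-variance pooling of log odds ratios, where the TDT contributes OR = T/U with variance 1/T + 1/U. The Python sketch below illustrates that idea under simplifying assumptions: it uses crude 2 × 2 allele counts rather than the adjusted logistic regression estimates of this study, and it is not a reimplementation of “catmap”. All function names and counts are hypothetical.

```python
import math


def log_or_case_control(case_with, case_without, ctrl_with, ctrl_without):
    """Crude log OR and its variance (Woolf) from a 2 x 2 allele table."""
    log_or = math.log((case_with * ctrl_without) / (case_without * ctrl_with))
    variance = 1 / case_with + 1 / case_without + 1 / ctrl_with + 1 / ctrl_without
    return log_or, variance


def log_or_tdt(transmitted, untransmitted):
    """Log OR and its variance from TDT transmission counts."""
    return (
        math.log(transmitted / untransmitted),
        1 / transmitted + 1 / untransmitted,
    )


def pool_fixed_effects(estimates, z=1.96):
    """Inverse-variance pooling of (log OR, variance) pairs.

    Returns (pooled OR, lower 95% limit, upper 95% limit).
    """
    weights = [1 / var for _, var in estimates]
    pooled = sum(w * lor for w, (lor, _) in zip(weights, estimates)) / sum(weights)
    se = math.sqrt(1 / sum(weights))
    return math.exp(pooled), math.exp(pooled - z * se), math.exp(pooled + z * se)


# Hypothetical studies: one case-control sample and one set of trios.
studies = [
    log_or_case_control(100, 200, 200, 200),
    log_or_tdt(20, 40),
]
pooled_or, lower, upper = pool_fixed_effects(studies)
print(f"pooled OR = {pooled_or:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

Larger studies have smaller variances and therefore carry more weight in the pooled estimate; a random-effects model would be a natural alternative when the study-specific ORs are heterogeneous.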

Results

Allele frequencies were in Hardy–Weinberg equilibrium in all population samples included (data not shown) and the genotyping success rate was ≥95% for the tested markers. Seven of the 36 SNPs genotyped in the primary sample of trios from the Prata Village were excluded from the analysis due to complete homozygosity (rs5743270, rs16900581, rs16900592, rs16900593, rs11995005, rs16931739, rs6478107). In addition, marker rs17065164 from CCDC122 was not analyzed due to the low number of informative families in the discovery sample (<10). Among the remaining 28 markers tested for association with leprosy per se in the Prata Village, 23 were not associated (Table S1). Three alleles of NOD2 markers (rs8057341-A, rs2111234-G and rs3135499-C) and two at CCDC122-LACC1 (rs4942254-C and rs2275252-A) were under-transmitted to affected offspring, indicating protection against leprosy (Table 2). Of these five, two NOD2 markers (rs8057341 and rs3135499) were also associated in the original Chinese GWAS.
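The Hardy–Weinberg check mentioned at the start of this section can be illustrated with a one-degree-of-freedom chi-square goodness-of-fit test. The Python sketch below is a generic textbook version and may differ from the test implemented in Haploview. The function name and genotype counts are hypothetical.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test for Hardy-Weinberg equilibrium at a biallelic marker.

    n_aa, n_ab, n_bb: observed genotype counts.
    Returns (chi-square with 1 degree of freedom, P value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele a
    q = 1 - p
    if p == 0 or q == 0:
        # Monomorphic markers carry no information and cannot be tested.
        raise ValueError("marker is monomorphic")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, math.erfc(math.sqrt(chi2 / 2))


# Hypothetical genotype counts that match the expected proportions exactly.
print(hwe_chi_square(25, 50, 25))
# Hypothetical counts with no heterozygotes at all, a strong deviation.
print(hwe_chi_square(50, 0, 50))
```

The explicit error for monomorphic markers mirrors the exclusion of completely homozygous SNPs described above, since such markers cannot be tested for equilibrium or association.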

Table 2 Association between leprosy per se and markers at CCDC122-LACC1 and NOD2 genes in a family-based study from Prata Village

Linkage disequilibrium analysis of the associated markers suggested the existence of a single association signal at each locus (Fig. 1): there is moderate LD between NOD2 marker rs8057341 and both rs2111234 (r² = 0.59) and rs3135499 (r² = 0.36); marker rs4942254 of CCDC122-LACC1 is in strong LD with rs2275252 (r² = 0.93). This was confirmed by stepwise multivariate logistic regression analysis: for each gene, when all associated markers were included in the model, association remained significant only for rs8057341 of NOD2 and rs4942254 of CCDC122-LACC1; therefore, these two markers were selected for further analysis in the replication samples.
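The r² values quoted above measure the squared correlation between alleles at two markers. The minimal Python sketch below computes r² from haplotype and allele frequencies that are assumed to be known; in practice, software such as Haploview first has to estimate haplotype frequencies from unphased genotypes. The function name and frequencies are hypothetical.

```python
def ld_r2(p_ab, p_a, p_b):
    """Pairwise LD r^2 between two biallelic markers.

    p_ab: frequency of the haplotype carrying allele A at marker 1
          and allele B at marker 2.
    p_a, p_b: frequencies of allele A and allele B, respectively.
    """
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        raise ValueError("r^2 is undefined for a monomorphic marker")
    return d * d / denominator


# Hypothetical frequencies: the two alleles always travel together.
print(ld_r2(0.5, 0.5, 0.5))
# Hypothetical frequencies: the two markers are independent.
print(ld_r2(0.25, 0.5, 0.5))
```

When r² between two associated markers is high, as for rs4942254 and rs2275252, the two markers carry nearly the same information and their association signals cannot be separated without a conditional analysis.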

Fig. 1 Relative position and linkage disequilibrium plot (LD) patterns of markers for the coiled-coil domain containing 122 gene (CCDC122) and laccase (multicopper oxidoreductase) domain containing 1 gene (LACC1) in Prata Village sample (a) and nucleotide-binding oligomerization domain containing 2 gene (NOD2) (b). Values inside boxes represent LD measured using the r² parameter and the intensity of shading is proportional to r². *SNPs associated in the Chinese GWAS (Zhang et al. 2009)

Among the three genetic models tested in our case–control studies (genotypic, allelic and carrier), the genotypic model best captured the differences between cases and controls in all populations. The association between NOD2 rs8057341 and leprosy was replicated in all case–control samples, with the genotype “AA” conferring resistance to leprosy (Table 3). In the family-based Almenara sample, however, the association with allele rs8057341-A did not reach statistical significance (P = 0.20, Table S2). A combined analysis including all case–control studies supported the protective effect of rs8057341-AA against leprosy (OR for AA = 0.49, P = 1.39 × 10⁻⁶, Table 3). Finally, to obtain an overall estimate, all samples (case–control and family-based studies) were included to build a summary plot that indicated a consensus protective OR value (overall OR for allele A = 0.80, P = 0.0001), confirming allele “A” of NOD2 rs8057341 as a leprosy resistance genetic factor (Fig. S1a).

Table 3 Genotype frequencies for the rs8057341-NOD2 and rs4942254-CCDC122/LACC1 SNPs in case–control groups and logistic regression results for association with leprosy per se in Rondonópolis, Bauru and Rio de Janeiro populations and combined analysis including the three studies

The genotype “CC” of rs4942254 at CCDC122-LACC1 was also associated with leprosy resistance in two of our replication samples: Rondonópolis and Rio de Janeiro (Table 3). However, no association was observed for this marker in Bauru and Almenara. The combined analysis of rs4942254 revealed an association between the CC genotype and leprosy per se (OR for CC = 0.72, P = 0.003, Table 3). The summary plot including all studies resulted in a global OR consistent with a protective effect of CCDC122-LACC1 allele rs4942254-C against leprosy (OR for allele C = 0.86, P = 0.003; Fig. S1b).

The LD plots for genes RIPK2 and TNFSF15 are available in Fig. S2. The SNPs associated with leprosy in the Chinese GWAS are also indicated in the LD figures.

Discussion

Genetic risk factors for complex traits have been intensively investigated and candidate genes have been proposed for several common diseases, including leprosy. The first GWAS in leprosy (Zhang et al. 2009), performed in a Chinese sample, identified new genes (CCDC122-LACC1, NOD2, TNFSF15, HLA-DR, RIPK2 and LRRK2) and pathways that encouraged validation and replication studies in other populations of distinct genetic backgrounds. A study involving an Indian and an African population sample validated only the association between leprosy and variants of the CCDC122-LACC1 locus (Wong et al. 2010). In contrast, when a Vietnamese population sample was investigated for the same genes and markers, only the LRRK2 and TNFSF15 associations were not replicated (Grant et al. 2012). These conflicting results reinforce the importance of additional validation and/or replication studies using independent population samples. In this scenario, we sought to validate the Chinese results, first using a unique family-based sample from the Prata Village, located in the Brazilian Amazonian state of Pará. Our assumption is that, due to its history as a former isolation colony, the Prata population is enriched for leprosy susceptibility genetic variants which, combined with very homogeneous demographic, socioeconomic, environmental and educational variables (Lazaro et al. 2010), makes it suitable for genetic association studies in leprosy. The small Prata sample size, however, poses an obvious limitation; thus, to confirm the observations, we applied a four-stage replication strategy using one family-based and three independent case–control samples from different regions of Brazil.

As a result, we have identified two polymorphisms, one at NOD2 and one at the CCDC122-LACC1 locus, consistently associated with host resistance to leprosy. Up to now, the association between NOD2 and leprosy susceptibility originally reported in a Chinese population has been validated in Nepalese and Vietnamese population samples (Berrington et al. 2010; Grant et al. 2012; Zhang et al. 2009), but not in Indians and Africans (Wong et al. 2010). Data from the Chinese study indicate that the G allele is associated with increased leprosy risk (Zhang et al. 2009). Here, allele A of NOD2 rs8057341 was associated with host resistance to leprosy in the Prata discovery sample and in all three case–control samples, although the association did not reach statistical significance in the Almenara families. The replication of the association signal for marker rs8057341, with the same resistance allele across these population samples, argues in favor of NOD2 as a leprosy per se susceptibility gene. Interestingly, in Vietnam, NOD2 rs8057341 was not associated with leprosy (Grant et al. 2012); however, the same study reported NOD2 marker rs9302752 as associated with the disease, which may indicate a distinct LD profile across these populations. This hypothesis is supported by the HapMap data: LD between rs8057341 and rs9302752 is r² = 0.77, 0.41 and 0.00 in the CEU, CHB and YRI populations, respectively (International HapMap 2005).

A second consistent association signal was observed for rs4942254, which is located within CCDC122; however, the extensive LD across the region does not allow us to exclude the neighboring gene LACC1 as the true source of the association detected. Finally, a combined analysis was conducted to summarize the information from all samples of the present study. The results confirmed the host resistance effect for both loci.

The conflicting results obtained in leprosy association studies may reflect biological differences associated with population-specific genetic effects (Manry and Quintana-Murci 2013). The closer ethnic proximity between Vietnamese and Chinese may explain the higher rate of successful validation observed among these populations (Grant et al. 2012). Differences in allele frequency and haplotype/LD structure reflect ethnic specificity; thus, the association pattern identified in the Chinese population may not be captured in different populations. Also, in the present study, it is important to consider that the small size of the discovery sample could have limited the power to capture more subtle genetic association effects. Finally, we cannot exclude the possibility that genes not validated/replicated for leprosy per se susceptibility are actually controlling susceptibility to endophenotypes of the disease, such as clinical forms and the occurrence of reversal reactions.

Zhang et al. (2009) placed five leprosy susceptibility genes (LRRK2, NOD2, RIPK2, HLA-DRB1 and TNFSF15) within the same biological pathway, which also includes PARK2, previously associated with leprosy (Mira et al. 2004). Several of these genes have been implicated in the host immune response in different infectious diseases (Schurr and Gros 2009; Zhang et al. 2011). The NOD2 gene encodes an intracellular sensing molecule that recognizes a component of the mycobacterial cell wall. Upon recognition, the NOD2-mediated signaling pathway promotes the recruitment of RIPK2 and formation of a NOD2–RIPK2 complex that indirectly leads to activation of NF-κB as a part of the host immune response to infection (Schurr and Gros 2009; Zhang et al. 2009, 2011). A functional study reinforced the importance of the NOD2 cascade in leprosy by demonstrating that the interaction of NOD2 with muramyl dipeptide, a mycobacterial cell wall component, leads to a distinct interleukin-32-dependent induction, resulting in the differentiation of monocytes into dendritic cells (Schenk et al. 2012). It has also been shown that NOD2 is able to induce autophagy, a crucial mechanism for intracellular bacterial clearance (Cooney et al. 2010). In contrast, the function of the CCDC122-LACC1 locus is still unknown. Remarkably, our data add to the accumulating body of evidence indicating a common association fingerprint across leprosy, Crohn’s and Parkinson’s disease (Orlova et al. 2011): variants of leprosy susceptibility genes PARK2, TNFSF15, NOD2, LACC1, LRRK2, IL23R, IL18RAP/IL18R1 and IL12B have also been described as associated with Crohn’s disease, Parkinson’s disease and inflammatory bowel disease (Liu et al. 2012; Trabzuni et al. 2013; Zhang et al. 2009, 2011).
It is possible to speculate that a better understanding of the genotype–phenotype regulatory switches controlled by these associated SNPs can help develop novel diagnostic and therapeutic approaches for infectious, inflammatory and neurodegenerative diseases.