Introduction

The aetiology of rheumatoid arthritis (RA) remains unclear, but its autoimmune nature, reflected by the presence of ‘rheumatoid factor’ (RF) in patients with RA, was recognized over 50 years ago (Rose et al. 1948). The RF assay, in its current form, remains suboptimal as a diagnostic test because of its variable sensitivity (54–88%) and specificity (48–92%) (Weinblatt and Schur 1980; Schellekens et al. 2000; Bizzaro et al. 2001; Bas et al. 2002; Saraux et al. 2002). In contrast, anti-citrullinated peptide antibodies (ACPA), detected by commercial tests for anti-cyclic citrullinated peptide (anti-CCP) antibodies, seem to play a pivotal role in the pathogenesis of RA: they are highly specific for the disease (Schellekens et al. 2000) and can be detected years before the onset of symptoms (Berglin et al. 2004; Nielen et al. 2004).

The genetic contribution to RA pathogenesis has been estimated at ∼60%, and the human leukocyte antigen (HLA) region has consistently shown the strongest genetic association with RA (MacGregor et al. 2000). Several studies have shown that shared epitope (SE) alleles are associated with anti-CCP-positive RA but not with anti-CCP-negative RA (Huizinga et al. 2005; Verpoort et al. 2005). Almost 30 years after the designation of HLA alleles as a risk factor for RA, non-HLA genes within the major histocompatibility complex (MHC) have also been examined for association with RA. The genes for tumour necrosis factor (TNF) lie within the MHC and have been a focus of intense interest, given the evidence that TNF-α plays a central role in the inflammatory cascade in affected joints and the striking efficacy of TNF-α antagonists as therapeutic agents. The results of such analyses are inconsistent: some studies report an association of particular TNF markers with RA (Danis et al. 1995; Mulcahy et al. 1996; van Krugten et al. 1999; Tuokko et al. 2001; Waldron-Lynch et al. 2001), while others find no differences (Wilson et al. 1995; Field et al. 1997; Vinasco et al. 1997; Yen et al. 2001; Low et al. 2002). Nevertheless, it is likely that HLA-DRB1 is not the only RA susceptibility gene in the MHC region.

A large number of confirmed associations between RA and susceptibility genes outside the MHC have been reported lately. Since 2007, there has been an explosion in the number of RA susceptibility genes identified and confirmed in well-powered cohorts (Begovich et al. 2004; Bowes and Barton 2008; Coenen and Gregersen 2009; Kochi et al. 2009; Plant et al. 2010). The first of these emerged only in 2003, when the gene peptidylarginine deiminase type 4 (PADI4) was identified in a Japanese population as a second risk factor for RA (Suzuki et al. 2003). The discovery of PADI4 as a risk factor was followed by the identification of protein tyrosine phosphatase non-receptor type 22 (PTPN22) in 2004 (Begovich et al. 2004; Carlton et al. 2005; Gregersen 2005). A year later, in 2005, cytotoxic T lymphocyte-associated antigen 4 (CTLA4) was found during a candidate gene analysis (Plenge et al. 2005). In 2007, a candidate gene approach (Kurreeman et al. 2007) identified a novel genetic risk factor in the 9q33 region of the genome containing TRAF1/C5; it was detected concurrently in a genome-wide study (Plenge et al. 2007a, b). Also in 2007, the signal transducer and activator of transcription 4 (STAT4) gene region on chromosome 2q was associated with RA pathogenesis (Remmers et al. 2007). SNPs in cluster of differentiation 244 (CD244) were found to be associated with susceptibility to RA and systemic lupus erythematosus in a Japanese cohort (Suzuki et al. 2008), but not in a Korean population (Cho et al. 2009). In the light of these findings, the role of genetics in RA is explored in this article.

Evidence supporting a genetic component in RA

There is extensive evidence of a role for genetic factors in RA. Studies of monozygotic (MZ) twins have revealed increased concordance rates of RA compared with dizygotic (DZ) twins. The MZ concordance rate for RA is four times higher than the DZ concordance rate, signifying a heritability of 40–60% (Lawrence 1969; Aho et al. 1986; Silman et al. 1993). The overall MZ twin concordance rate is 12–15%. These twin analyses support an upper limit to the genetic contribution to RA. An interesting study found smoking to be a predictor primarily in the subset of patients with RA-associated HLA-DRB1 genotypes, illustrating that genetic and environmental factors could interact in predisposing to RA (Padyukov et al. 2004).

Susceptibility HLA genes within the MHC region

Association of HLA–DRB1 SE alleles in RA:

In 1987, Gregersen et al. first proposed a unifying hypothesis for the association of different HLA-DRB1 specificities with RA, termed the ‘SE hypothesis’. They hypothesized an association between RA and HLA-DRB1 SE alleles, including DRB1*04 and DRB1*01 alleles. They showed that RA is associated with specific HLA-DRB1 (DRB1) alleles that encode a conserved sequence of amino acids (70QRRAA74, 70RRRAA74 or 70QKRAA74) comprising residues 70–74 in the third hypervariable region (HVR3) of the DRβ1 chain (Gregersen et al. 1987). These residues constitute a helical domain forming one side of the antigen-binding site, a position likely to affect antigen presentation.
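The SE definition above, a fixed motif at residues 70–74 of the DRβ1 chain, can be illustrated with a short Python sketch. It is not a typing method used by any of the cited studies. The function names and the placeholder sequences are hypothetical, and the code assumes that the input is the mature DRβ1 chain numbered from residue 1, so that residue 70 falls at string index 69.

```python
from typing import List, Optional

# SE motifs at DRβ1 residues 70-74, as listed in the text above.
SE_MOTIFS = {"QRRAA", "RRRAA", "QKRAA"}


def shared_epitope_motif(mature_drb1: str) -> Optional[str]:
    """Return the SE motif found at residues 70-74 (1-based), or None.

    Assumption: `mature_drb1` is the mature chain (signal peptide removed),
    so residue 70 is at Python index 69.
    """
    if len(mature_drb1) < 74:
        raise ValueError("sequence too short to cover residues 70-74")
    motif = mature_drb1[69:74].upper()
    return motif if motif in SE_MOTIFS else None


def se_dose(allele_sequences: List[str]) -> int:
    """Count how many of an individual's DRB1 alleles (0, 1 or 2) carry an SE motif."""
    return sum(shared_epitope_motif(seq) is not None for seq in allele_sequences)


def toy_chain(motif: str) -> str:
    """Build a synthetic placeholder chain (NOT a real HLA sequence) with
    `motif` placed at residues 70-74."""
    return "A" * 69 + motif + "A" * 20


if __name__ == "__main__":
    genotype = [toy_chain("QKRAA"), toy_chain("DERAA")]  # one SE, one non-SE motif
    print(se_dose(genotype))
```

Counting SE-positive alleles per individual in this way gives the 0, 1 or 2 copy classification on which allele-dose comparisons rely.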

The SE hypothesis assumes that these specific class II molecules are directly involved in the pathogenesis of RA. The mechanism underlying the SE–RA association is unclear. Prevailing hypotheses attribute it to presentation of arthritogenic antigens (Wucherpfennig and Strominger 1995), or to T-cell repertoire selection (Bhayani and Hedrick 1991). However, data supporting antigen-specific responses as the primary event in RA are inconclusive. Additionally, various non-RA human diseases (Weyand et al. 1994; Tait et al. 1995) and experimental animal models of autoimmunity (Ollier et al. 2001) have been shown to be associated with SE-coding alleles as well. Moreover, the antigen-presentation hypothesis is difficult to reconcile with SE allele-dose effects on RA penetrance and disease severity (Holoshitz and Ling 2007).

Table 1 Cytogenetic loci of RA susceptibility genes and their function.

Association of HLA–DRB1 alleles with serum anti-CCP antibody:

Recent studies have shown a significant association between the SE and RA in patients who are anti-CCP antibody-positive (Huizinga et al. 2005; Irigoyen et al. 2005; Michou et al. 2008; Ding et al. 2009). For example, a significantly greater prevalence of anti-CCP antibodies was found in those who carried two SE alleles than in those with one or none (85, 58 and 30%, respectively). Hence, the SE seems to predispose to anti-CCP-positive RA, and the development of ACPAs may be an intermediate step that explains the association of the SE with RA susceptibility and/or severity.

Modelling studies have shown that the P4 pocket of SE-containing HLA-DRB1 molecules should bind citrulline more effectively than arginine. Studies in mice carrying the HLA-DRB1*0401 transgene have shown that converting a residue from arginine to citrulline leads to enhanced T-cell activation and increased binding of peptides by the SE (Hill et al. 2003). It has also been hypothesized that smoking leads to elevated citrullination of proteins. Carriage of SE alleles in this environmental background would increase susceptibility to RA because SE molecules bind citrullinated peptides more strongly and stimulate an exaggerated T-cell response. The exaggerated T-cell response, in turn, may drive the increased autoantibody production by B-cells, including anti-CCP antibodies, seen in RA (Auger et al. 2005; van der Helm-van Mil et al. 2006).

van der Woude et al. (2009) showed that ACPA-positive and ACPA-negative RA have a similar heritability of 66%. This means that genetic predisposition also plays an important role in the pathogenesis of ACPA-negative RA, for which most individual genetic risk factors remain to be identified.

Non-HLA genes within the MHC region

TNF-α:

Non-HLA genes within the MHC have also been examined for association with RA. The genes for tumour necrosis factor (TNF) lie in the class III region of the MHC, ∼250 kb centromeric of the human leukocyte antigen (HLA)-B locus and 850 kb telomeric of HLA-DR. To date, five polymorphisms have been described within the TNF-α gene. Four polymorphisms consist of a guanine (G) to adenine (A) transition at positions –376, –308, –238 and –163 (Wilson et al. 1993), and the fifth consists of a cytosine (C) insertion in a C-stretch starting at position +70 (Brinkman et al. 1995). All known TNF-α polymorphisms are situated in a region of the gene that is central to the transcriptional regulation of TNF-α expression. Single base alterations in such regions may have marked effects on gene expression (Matsuda et al. 1992).

The most studied results suggest that TNF-α gene promoter polymorphisms influence the outcome of this chronic disease. Despite this evidence, the value of genotyping RA patients in order to predict their clinical course will remain unproven until a proper prospective evaluation of such patients validates this hypothesis.

Genetic risk factor located outside the MHC region

PADI4:

The PADI4 gene is located on chromosome 1 (1p36). It encodes the type 4 peptidylarginine deiminase enzyme, which catalyses the posttranslational modification of arginine to citrulline, producing citrullinated proteins (table 1; figure 1) (Vossenaar et al. 2004). The mechanism by which PADI4 genotype may influence RA susceptibility has not yet been elucidated. Antibodies to citrullinated peptides are extremely specific for RA and usually precede the development of disease, suggesting an essential role in RA pathogenesis. PADI4 was the first non-HLA genetic risk factor known to be associated with RA, especially in the Japanese population (Suzuki et al. 2003). Association has also been observed in Korean and North American populations (Plenge et al. 2005; Kang et al. 2006). In contrast, studies in Spanish, Swedish and UK populations found no evidence for association of PADI4 with RA (Caponi et al. 2005; Martinez et al. 2005). The strongest association has been reported for a single nucleotide polymorphism (SNP) located in intron 3 (341-15A→T) of PADI4, called PADI4_94 (rs2240340), and a meta-analysis revealed a significant association between RA and the PADI4_94 SNP in Asian populations (Takata et al. 2008).

Figure 1 Peptidylarginine deiminase enzyme catalyses the posttranslational modification of arginine to citrulline, generating citrullinated proteins.

Possible explanations for this disparity between populations are the presence of other genetic or environmental factors that interact with the genetic factor in a specific population, thereby affecting disease susceptibility, or enrichment of genetic variants in one population but not in the other. Although the peptidylarginine deiminases are involved in the generation of ACPA, there is no compelling evidence that PADI4 genotypes are associated with ACPA levels or with ACPA-positive disease in particular (van der Helm-van Mil and Huizinga 2008). Recently in India, Panati et al. (2012) reported that a polymorphism in exon 4 (padi4_104, [rs1748033]) of PADI4 showed significant association of the ‘C’ allele with RA in the study population (P = 0.0008). A polymorphism in exon 3 (padi4_92, [rs874881]) also exhibited moderate association with the disease (P = 0.075). However, no association of the disease was found with the SNPs padi4_89 (rs11203366) and padi4_90 (rs11203367) in exon 2 of PADI4.

PTPN22:

The minor allele of a nonsynonymous SNP (rs2476601, 1858C→T, R620W) in the PTPN22 gene, positioned on chromosome 1p13, has been found to be associated with RA. The frequency of the PTPN22 R620W variant was increased in RA patients versus healthy controls in studies of multiple North American and European Caucasian populations, but not in Koreans (Kochi et al. 2009). The PTPN22 gene is a compelling biological candidate for involvement in autoimmune diseases. It is expressed in a range of immunologically relevant tissues (Begovich et al. 2004) and encodes the intracellular protein lymphoid tyrosine phosphatase (LYP). Protein tyrosine phosphatases (PTPs) play a crucial role in signal transduction and are integral to the T-cell antigen receptor (TCR) signalling pathway. LYP itself is known to be an effective inhibitor of T-cell activation (table 1) (Hill et al. 2002).

STAT4:

The signal transducer and activator of transcription 4 (STAT4) gene, located on the long (q) arm of chromosome 2 between positions 32.2 and 32.3, is another non-MHC gene associated with RA pathogenesis (Remmers et al. 2007). The JAK/STAT pathway is the signalling target of a multitude of cytokines that are thought to perform biologically significant roles in rheumatoid synovial inflammation (Walker and Smith 2005). Specifically, the protein encoded by STAT4 transmits signals induced by several key cytokines, including IL-12, IL-23, and type I interferons (IFNs) (table 1) (Watford et al. 2004). In response to cytokines and growth factors, STAT family members are phosphorylated by the receptor-associated kinases, and then form homodimers or heterodimers that translocate to the cell nucleus, where they act as transcription activators.

Association of STAT4 with RA was determined through a combination of linkage and candidate gene studies. Compared with HLA-DRB1 and PTPN22, the association of STAT4 with RA is more modest. Four polymorphisms in tight linkage disequilibrium (rs11889341, rs7574865, rs8179673 and rs10181656) form a susceptibility haplotype, tagged by the T allele of rs7574865, that has the strongest reported association with RA (Remmers et al. 2007). Association of the STAT4 variant (rs7574865) with RA was confirmed in patients of European, North American and Asian descent (Zervou et al. 2008; Lee et al. 2010). Europeans appear to have the lowest (21.4%) and Asians the highest (32.0%) prevalence of the rs7574865 variant among the populations studied (Lee et al. 2010). Stratification of RA patients according to the presence of ACPA disclosed a statistically significant association between the rs7574865 variant and RA in both ACPA-positive and ACPA-negative RA patients versus controls (Orozco et al. 2008). In 2012, data from a Spanish population suggested that patients with early arthritis who are homozygous for the T allele of rs7574865 in STAT4 may develop a more severe form of the disease with increased disease activity and disability (Lamana et al. 2012).

TRAF1-C5:

The TNF receptor-associated factor 1 (TRAF1) and complement component 5 (C5) genes are located on chromosome 9. TRAF1 is a member of the TNF receptor-associated factor (TRAF) family, a group of adaptor proteins that link TNF receptor family members (for example, the receptors for TNF-α) to downstream signalling (Arch et al. 1998). These molecules are involved in signalling pathways that play a role in cell proliferation and differentiation, apoptosis, bone remodelling and activation or inhibition of cytokines (table 1) (Speiser et al. 1997). Interestingly, RA patients who are GG homozygotes at the TRAF1-C5 SNP rs3761847 have a substantially increased risk of death (hazard ratio 3.96, 95% confidence interval 1.24 to 12.6, P = 0.02) from malignancy or sepsis, conceivably allowing identification of patients for appropriate screening (Panoulas et al. 2009).
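As a reading aid, the relationship between the hazard ratio, its 95% confidence interval and the P value quoted above can be checked with a short Python sketch. It rests on an assumption that is not stated in the source: that the interval is a Wald interval, symmetric on the log scale, with a normally distributed test statistic. The function name is illustrative only.

```python
import math


def z_and_p_from_ratio_ci(estimate: float, lower: float, upper: float,
                          z_crit: float = 1.96):
    """Recover the log-scale standard error, z statistic and two-sided P value
    from a ratio estimate (hazard ratio or odds ratio) and its 95% CI.

    Assumption: the CI is symmetric on the log scale (Wald interval).
    """
    se_log = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(estimate) / se_log
    # Two-sided P value under a standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se_log, z, p


if __name__ == "__main__":
    # Values quoted in the text for rs3761847 GG homozygotes.
    se_log, z, p = z_and_p_from_ratio_ci(3.96, 1.24, 12.6)
    print(f"SE(log HR) = {se_log:.3f}, z = {z:.2f}, two-sided P = {p:.3f}")
```

Under these assumptions the recovered P value is close to the reported 0.02, and the width of the interval on the log scale shows how imprecise the estimate is.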

CD244:

CD244 is a member of the signalling lymphocyte activation molecule family and is one of the molecules that activate or inhibit natural killer cells. Recent studies on the molecular mechanisms of this family have indicated that its members play critical roles in the immune system and in autoimmune diseases (table 1) (Veillette 2006). The cluster of differentiation 244 (CD244) gene is located at 1q23.1. A Japanese study identified RA susceptibility alleles of two functional SNPs in CD244 (rs3766379 and rs6682654) that were associated with ∼1.5–1.7-fold higher expression levels of CD244 (Suzuki et al. 2008). More recently, an analysis of SNPs in CD244 and susceptibility to RA and SLE in a Korean population did not show allelic association with susceptibility to RA (Cho et al. 2009).

CTLA4:

Genes involved in the regulation of T-cell responses may be primary determinants of susceptibility to RA. CTLA4 is located on chromosome 2q33 (table 1). Polymorphisms within the CTLA4 gene appear to be associated with RA (Lee et al. 2003; Zhernakova et al. 2005; Suppiah et al. 2006). A G→A SNP in the 3′ untranslated region (CT60; rs3087243) has received the most thorough investigation, especially in European populations (Suppiah et al. 2006).

A large cohort (n cases/controls = 2370/1757) from the North American Rheumatoid Arthritis Consortium (NARAC) and the Swedish Epidemiological Investigation of Rheumatoid Arthritis (EIRA) collections provided support for an association of CTLA4 (CT60 allele) with the development of RA, but only in the NARAC cohort (OR = 1.1, 95% CI: 1.0–1.2; P = 0.004) (Plenge et al. 2005). When those results were combined with previously published data for CTLA4, there was continued evidence of association with RA (OR = 1.1, 95% CI: 1.0–1.2; P = 0.01) (Plenge et al. 2005). These earlier results agree with a recent meta-analysis that found an association of a CTLA4 gene polymorphism with RA in Caucasians (OR = 0.9, P = 1.8 × 10⁻³) and also reported that CTLA4 enhanced the development of ACPA-positive as compared with ACPA-negative RA (Daha et al. 2009). As with the HLA-DRB1 SE and PTPN22, these reports indicate that CTLA4 influences the development of RA only in anti-CCP-positive patients, supporting the evidence for a divergence in pathology dependent on anti-CCP status.
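For readers less familiar with the effect sizes quoted throughout this section, the following Python sketch shows one common way an allelic odds ratio and its 95% confidence interval are derived from a 2 × 2 table of allele counts, using Woolf's logit method. The counts in the example are invented for illustration and are not taken from any study cited here; the cited studies may have used other estimators.

```python
import math


def allelic_odds_ratio(case_risk: int, case_other: int,
                       ctrl_risk: int, ctrl_other: int,
                       z_crit: float = 1.96):
    """Odds ratio and Woolf (logit) 95% CI from allele counts.

    case_risk / case_other: risk and non-risk allele counts in cases.
    ctrl_risk / ctrl_other: risk and non-risk allele counts in controls.
    All four counts must be positive.
    """
    odds_ratio = (case_risk * ctrl_other) / (case_other * ctrl_risk)
    # Standard error of log(OR): square root of the summed reciprocal counts.
    se_log = math.sqrt(1 / case_risk + 1 / case_other +
                       1 / ctrl_risk + 1 / ctrl_other)
    lower = math.exp(math.log(odds_ratio) - z_crit * se_log)
    upper = math.exp(math.log(odds_ratio) + z_crit * se_log)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical allele counts, for illustration only.
    or_, lo, hi = allelic_odds_ratio(600, 400, 500, 500)
    print(f"OR = {or_:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

Because the standard error shrinks as the counts grow, modest effects such as OR = 1.1 reach significance only in large cohorts, which is consistent with the sample sizes described above.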

Chromosome 6q23:

Two SNPs, rs6920220 (A allele) and rs10499194 (C allele), were found to be independently associated with ACPA+ disease. Both SNPs map to a single linkage disequilibrium block spanning ∼60 kb in a region on chromosome 6q23 that lacks known genes or transcripts. The closest genes are oligodendrocyte lineage transcription factor 3 (OLIG3) and tumour necrosis factor α-induced protein 3 (TNFAIP3). The latter is of potential importance to RA pathogenesis, as the protein TNFAIP3 acts as a negative regulator of NF-κB (Wertz et al. 2004). So far, however, the functional relevance of the reported polymorphisms is unknown (table 1).

rs6920220 was initially identified in ACPA+ patients with RA (minor allele OR = 1.38) originating from the UK (Wellcome Trust Case Control Consortium 2007). rs10499194 was identified in North American ACPA+ patients (Plenge et al. 2007a, b). However, although the intergenic region is certainly associated with RA susceptibility, the involvement of the TNFAIP3 gene is yet to be confirmed.

Other RA susceptibility genes

A meta-analysis of available data from genome-wide association studies of RA provided strong evidence for association of the CD40 gene with RA susceptibility (Raychaudhuri et al. 2008). CD40, which is expressed on the surface of B cells, monocytes and dendritic cells, interacts with CD154 on T cells. This interaction is important in immunoglobulin class switching, memory B cell development and germinal centre formation.

An association of the SPRED2 locus was reported in an expanded meta-analysis of six genome-wide association studies in Caucasian RA patients from the US, UK, Sweden and Canada, all of whom were anti-CCP antibody-positive (Stahl et al. 2010). The associated SNP mapped to intron 1 of the gene, which is involved in regulating CD45+ cells via the Ras-MAP kinase pathway.

Nongenetic factors

A case–control study in Denmark addressed a large number of environmental factors potentially involved in the aetiology of RA (Pedersen et al. 2006). Upon dichotomization of patients with RA according to the presence or absence of anti-CCP antibodies, the study showed that environmental risk factors differ considerably between anti-CCP-positive and anti-CCP-negative RA. One of the most classic genetic risk factors for an autoimmune disease, the shared epitope in RF-seropositive RA, is strongly influenced by the presence of a defined environmental risk factor, smoking, in the population at risk.

As with genetic associations, environmental factors such as body weight, smoking and blood transfusion make a modest contribution to disease risk, and their role has been confirmed by replication of these findings in large populations (Firestein 1997; Symmons et al. 1997).

Detection of HLA and non-HLA polymorphisms

The presence of HLA-DR4 and HLA-DRB1 increases the risk of RA 7-fold (Weyand and Goronzy 2000). Historically, HLA-A, HLA-B, and HLA-C antigens were identified with the conventional microcytotoxicity test, in which well-validated antisera were used. For HLA-DR antigen determinations, enriched suspensions of B-cells were prepared (Legrand et al. 1984).

HLA-DRB1 alleles have since been genotyped by the polymerase chain reaction (PCR) sequence-specific oligonucleotide probe hybridization (SSOPH) method. One such study suggested that the presence of two *04 SE alleles is associated with a higher risk of developing amyloid A amyloidosis in Japanese patients with RA (Migita et al. 2006).

In 2011, Naqi et al. genotyped HLA-DRB1 alleles in Pakistani patients with RA using a low-resolution PCR sequence-specific primer (SSP) method and concluded that HLA-DRB1*04 was present at significantly increased frequency in patients with RA. HLA-DRB1*11 was significantly more frequent in the control group than in patients with RA, indicating a possible protective effect (Naqi et al. 2011).

Recently, Mourad and Monem (2013) genotyped HLA-DRB1 alleles by the PCR-SSP method. The results indicated that the HLA-DRB1*01, HLA-DRB1*04 and HLA-DRB1*10 alleles were related to RA, while HLA-DRB1*11 and HLA-DRB1*13 protect against RA in the Syrian population.

The SSOPH and PCR-SSP approaches both suffer from the need for large numbers of probes or primers and for frequent updating in response to newly published alleles. Direct sequencing is therefore increasingly attractive as a general method for HLA typing.

A number of non-HLA genes outside the MHC region also exhibit an association with RA. Martinez et al. (2005) analysed PADI4 polymorphisms by TaqMan assays and suggested that they do not play a role in susceptibility to RA in a European population. This finding contrasted with results in the Japanese population and agreed with those of previous British and French studies.

Pradhan et al. (2012) genotyped the PTPN22 1858C/T polymorphism by the PCR–RFLP method and proposed that there is no direct association between this polymorphism and RA in patients from western India.

Recently, Chang et al. (2013) genotyped 17 tag SNPs across the PAD locus using the MassARRAY matrix-assisted laser desorption ionization time-of-flight mass spectrometry system, and suggested that PADI2 is significantly associated with RA and may be involved in the pathogenesis of the disease.

Conclusions: genetic studies in RA

The last few years have seen tremendous achievements in the identification of RA susceptibility genes. Since 2011, more than 20 robustly associated loci have been found to be involved, mainly with ACPA-positive disease. Still, it is likely that many more remain to be brought to light. Additionally, advances in molecular genetics, such as microarray chips, will allow simultaneous large-scale identification of thousands of genetic polymorphisms segregating with RA. One of the first genome-wide association analyses in RA found several new candidate SNPs for RA and chronic inflammatory arthritis. The authors performed a replication analysis in an independent subset of SNPs, from which KLF12 emerged as a new candidate susceptibility gene for RA (Julià et al. 2008). In conclusion, a suggested anti-CCP and genetic profile consisting of HLA-DRB1, TNF-α, PADI4, PTPN22, STAT4, TRAF1-C5, CD244, CTLA4 and chromosome 6q23 shows promise for identifying those patients who may benefit from a more aggressive treatment regimen or immediate biological therapy. The cost of such an extensive panel may be justified by the benefits to patients and to the management of disease burden in the long term.

Future research and treatment directions

Although previous case–control studies in different populations have proposed a possible association of these alleles with RA, conflicting results have been reported on the significance of genetic variants in relation to autoantibody seropositivity in the development of RA. Ethnic differences may play a role in the conflicting results among these association studies. However, results show that the selective genotyping approach is more efficient in detecting common variants than rare variants, and it is efficient only when the threshold for declaring significance is not stringent. In summary, the selective genotyping approach is most suitable for detecting common variants in candidate gene-based studies. There is a dearth of such genetic and seropositivity association studies in India, and we are planning to undertake such studies (Vasanth and Nalini 2011). Eventually, the combined study of autoantibodies and genetic markers will extend knowledge of the pathogenesis of RA and indicate the combined predictive value of these markers for the disease. This knowledge may make possible earlier preventive strategies as well as the development of more appropriate and effective treatment options in RA.