Introduction

Rheumatoid arthritis (RA) causes an inflammatory polyarthritis affecting 1 % of the population and is characterized by a chronic erosive arthropathy typically of the small joints [1, 2]. RA is a serious public health problem that results in excess disability, morbidity, and mortality [3••, 4]. While the etiology of RA is currently not established, genetics, lifestyle factors, and biomarkers have been associated with risk of RA development in many research investigations over the past few decades [5••]. The current paradigm for RA pathogenesis is that individuals with increased genetic susceptibility progress through preclinical phases prior to clinically apparent RA [6•, 7]. These preclinical RA phases include genetic risk, immune activation without symptoms (characterized by upregulation of cytokines and autoantibody production), arthralgias and other non-specific symptoms, and inflammatory arthritis prior to the development of classifiable RA [8]. Epidemiologic investigations have helped guide basic and translational studies that aim to elucidate the biology of these preclinical RA transition phases.

Local loss of immune tolerance may trigger an inflammatory milieu and dysregulation of the adaptive immune system, causing systemic inflammation that eventually manifests as RA. Several anatomic sites have been posited as possible initiating sites for RA. Mucosal surfaces, in particular, are speculated to be the local sites of initial innate immune system dysregulation [9]. The anatomic sites that may be important in the development of RA include the lung, respiratory tract, periodontium, gut, bladder, and reproductive tract [9]. However, the lung and respiratory tract have the most evidence in support of being an initiating site for RA [10•].

While much remains to be elucidated concerning the biologic mechanisms of RA pathogenesis, RA prevention strategies are actively being pursued. Epidemiologic studies have identified factors, such as genetics, family history, environment, and biomarkers, that allow investigators to target those at highest risk for RA prevention [11•, 12••]. The association of RA with potentially modifiable factors, in particular smoking and obesity, suggests that lifestyle interventions such as smoking cessation or weight loss may prevent or delay RA onset [13•, 14–16]. Since many effective targeted and non-targeted drugs for RA now exist, several clinical trials using pharmacologic interventions for RA prevention are currently underway among those at especially high risk for RA due to seropositivity or early symptoms [13•]. Understanding the pathogenesis of RA remains crucial to the success, interpretation, and implementation of these ongoing as well as future RA prevention studies.

The aim of this article is to review the evidence supporting the lung and respiratory tract as an initiating site for RA pathogenesis, with a focus on how the epidemiology of smoking has informed investigations into the biology of preclinical seropositive RA pathogenesis in the lung.

Cigarette Smoking, Cessation, and Overall Risk for Seropositive RA

Cigarette smoking is the best established lifestyle risk factor for RA and may provide the external trigger for those asymptomatic but at increased genetic risk for RA [17–24]. There is evidence that smoking is important throughout the preclinical phases of RA development (see Fig. 1). The association of smoking is particularly strong for seropositive RA (defined as either rheumatoid factor [RF] or anti-cyclic citrullinated peptide [CCP] positivity). Smoking has a clear association with RA risk and may contribute up to 35 % of the attributable risk for seropositive RA [25, 26•]. In a large meta-analysis that included 11 studies, 13,885 RA cases among a total of 593,576 individuals, current smokers had an odds ratio (OR) of 1.64 for seropositive RA compared to never smokers [17]. This effect of smoking for increased RA risk was more potent in men compared to women [17]. Current male smokers had an OR of 3.91 for seropositive RA compared to male non-smokers [17]. Both smoking status and intensity are associated with increased RA risk with clinically significant risk most apparent for those with >10 pack-years [27, 28••].
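To make the notion of attributable risk concrete, the sketch below applies Levin's formula for the population attributable fraction, PAF = p(RR − 1) / (1 + p(RR − 1)), where p is the prevalence of exposure and RR is the relative risk. This is an illustration only and rests on assumptions that are not in the text: the odds ratios quoted above are used as stand-ins for relative risks (reasonable only when the outcome is rare), and the smoking prevalence of 20 % is hypothetical. The output is not expected to reproduce the 35 % figure cited above, which comes from other studies and methods.

```python
def levin_paf(exposure_prevalence: float, relative_risk: float) -> float:
    """Population attributable fraction by Levin's formula.

    exposure_prevalence: proportion of the population exposed (0 to 1).
    relative_risk: risk in the exposed divided by risk in the unexposed.
    """
    if not 0.0 <= exposure_prevalence <= 1.0:
        raise ValueError("exposure_prevalence must be between 0 and 1")
    if relative_risk <= 0.0:
        raise ValueError("relative_risk must be positive")
    excess = exposure_prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)


if __name__ == "__main__":
    # Odds ratios quoted in the text, treated here as approximate relative risks.
    odds_ratios = {"all current smokers": 1.64, "male current smokers": 3.91}
    # Hypothetical smoking prevalence; NOT taken from the cited studies.
    assumed_prevalence = 0.20
    for label, odds_ratio in odds_ratios.items():
        paf = levin_paf(assumed_prevalence, odds_ratio)
        print(f"{label}: PAF = {paf:.1%}")
```

Because the result depends strongly on the assumed prevalence, the same odds ratio can correspond to very different attributable fractions in different populations.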

Fig. 1 Schematic of proposed biologic mechanisms linking smoking and the lung to preclinical phases in the pathogenesis of anti-citrullinated protein antibody (ACPA)-positive rheumatoid arthritis (RA). An asymptomatic individual with high genetic risk, such as the HLA-DRB1 shared epitope, is exposed to environmental triggers such as cigarette smoking. This induces mucosal inflammation in the airways and alveoli of the lung. Activated enzymes, including peptidylarginine deiminase (PAD), stimulate citrullination of proteins at these sites to form neoantigens. Antigen presenting cells (APCs) in the innate immune system present these neoantigens through the HLA-DRβ1 protein to T cells, which results in adaptive immune system dysregulation and leads to systemic inflammation. Stimulated B cells in the respiratory mucosa produce ACPA. Systemic inflammation leads to non-specific symptoms such as fatigue and arthralgias. Autoantibodies exhibit specificity to peripheral joints, eventually leading to synovitis and the clinical diagnosis of ACPA-positive RA

Studies suggest that smoking cessation may decrease RA risk [24, 27, 29••]. Among women followed prospectively for 26 years in the Nurses’ Health Study, risk of RA for past smokers was similar to the RA risk of non-smokers ≥20 years after smoking cessation [27]. Similarly, in the Swedish Epidemiological Investigations of RA that compared incident RA cases to population-based controls, former smokers had similar RA risk compared to never smokers ≥20 years after cessation [24]. In the prospective Swedish Mammography Cohort, risk of RA was still significantly elevated 15 years after smoking cessation compared to never smokers [29••]. While the risk for RA did not return to the baseline RA risk of never smokers in this study, risk steadily decreased over increasing duration of smoking cessation [29••].

While smoking is clearly associated with seropositive RA, this association is attenuated or null for seronegative RA [17, 28••, 30]. Some studies suggest that seronegative RA may be a heterogeneous collection of diseases without a common pathogenesis or distinct risk factors [31]. A recent study aimed to define a homogeneous population of seronegative RA patients by removing patients with detectable RA-related autoantibodies on sensitive research-only assays for anti-citrullinated protein antibodies (ACPA) as well as patients with HLA-B27 where spondyloarthropathy may have masqueraded as seronegative RA [32••]. Using these methods, distinct genetic differences were detected between patients with seropositive and seronegative RA, implying separate pathogenesis for each disease phenotype. However, most studies rely on the clinical classification of seronegative RA, which has inherent clinical heterogeneity. For this reason, the pathogenesis of seropositive RA, particularly ACPA-positive RA, is best developed and is therefore the focus of this article. We use ACPA to refer to anti-citrullinated protein antibodies detected by any research or commercial assay, and CCP (including particular assays, such as CCP2, CCP3, or CCP3.1) to refer to the specific commercial assays.

Cigarette Smoking and Transitions Between Preclinical Phases of RA

Smoking has been associated with increased risk for progression among preclinical RA transition states (Table 1). Among patients with arthralgias and detectable serum IgM-RF or CCP2, smoking may hasten the emergence of classifiable RA [37••]. However, when evaluating all patients with arthralgias, there is no clear association of smoking with progression to detectable inflammatory arthritis on physical examination [38•]. Among first-degree relatives without RA in the Studies for the Etiology of RA, smoking >10 pack-years, in addition to current smoking status, was associated with twofold increased risk of developing inflammatory joint signs [36•]. Most of these unaffected relatives were asymptomatic and further away from RA development compared to those with seropositive arthralgias, so the particular timing of smoking in relation to preclinical RA stages is likely important.

Table 1 Selected epidemiologic studies investigating smoking in RA preclinical phase transitions

Smoking may also be important in the development of RA-related autoantibodies. Patients with chronic obstructive pulmonary disease (COPD) without RA have increased levels of CCP and RF compared to controls. Smoking is associated with development of RA-related autoantibodies (CCP, ACPA, and RF) in patients who later develop RA [39•, 40] and in those at high genetic risk [33]. A Swedish study analyzed banked samples collected before RA onset from patients who later developed RA and examined the HLA shared epitope, CCP, ACPA targeting specific antigens, and smoking. Smoking was associated with the presence of ACPA closer to disease onset. Ever smokers with the shared epitope had fourfold increased odds of being positive for CCP2, as well as for other ACPA, within 10.5 years of RA diagnosis compared to never smokers without the shared epitope [34••]. Since the presence of circulating RA-related autoantibodies is associated with very elevated risk for RA, smoking may propagate the generation of RF, CCP, and other ACPA [41, 42••].

Finally, other inhaled irritants such as silica, air pollution, and textile dust may also increase the risk for RA, particularly seropositive RA [43–46, 47•]. While the epidemiology of these environmental factors is less developed compared to smoking, the associations of these other environmental factors emphasize that the lung, and potentially other mucosal sites, may provide the common link and micro-environment important for a variety of biologic processes that ultimately present clinically as RA.

HLA–Smoking Statistical Interaction Provides Clues to the Biology of Transitions of Preclinical RA

The HLA shared epitope, present in the Major Histocompatibility Complex Class II on chromosome 6, is the strongest genetic risk factor for RA, among many other known genetic risk factors [48, 49, 50••]. Shared epitope alleles encode specific amino acid haplotypes in the HLA-DRβ1 protein that are important in antigen processing and presentation and thus important in immune function [51]. Many epidemiologic studies have detected a statistical interaction between the shared epitope and smoking [52, 53, 54•, 55, 56, 57•, 58•, 59]. This finding provided background for the hypothesis that smoking may induce autoantigens that interact with the innate immune system, which stimulates the adaptive immune system and eventually leads to joint specificity through molecular mimicry of autoantibodies and, finally, clinical symptoms of RA [60–65].

Specific amino acid positions within the peptide binding groove of the HLA-DRβ1 protein are associated with especially high risk of seropositive RA [51]. These sites also have statistical interactions with cigarette smoking for risk of seropositive, especially ACPA-positive, RA [52, 53, 54•, 55, 56, 57•, 58•]. In a trans-ethnic study using data from the Nurses’ Health Studies in the US, the Swedish Epidemiological Investigations of RA, and the Korean RA Cohort Study, heavy smoking (defined as >10 pack-years) interacted with HLA-DRβ1 positions 11 and 13 for increased risk for seropositive RA [35••]. Specific amino acid haplotypes conferred increased risk of seropositive RA (RF or CCP in the Nurses’ Health Study; CCP in both the Swedish and Korean studies) [35••]. For example, presence of valine at HLA-DRβ1 position 11 or histidine at position 13 significantly increased seropositive RA risk while presence of serine at HLA-DRβ1 positions 11 or 13 significantly decreased RA risk [35••]. These results suggest a physical interaction of a neoantigen, specifically induced by smoking, with the HLA-DRβ1 protein. However, the neoantigen(s) produced by this process are yet to be definitively identified.
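One common way epidemiologic studies quantify a gene-environment statistical interaction is as a departure from additivity of effects, summarized by the relative excess risk due to interaction (RERI) and the attributable proportion due to interaction (AP). The sketch below shows that arithmetic. It is an illustration only: the text does not state which interaction scale the cited studies used, and the three relative risks in the example are hypothetical values chosen for readability, not results from any study.

```python
def additive_interaction(rr_both: float, rr_gene_only: float, rr_smoking_only: float) -> dict:
    """Additive-scale interaction measures from three relative risks.

    Each relative risk is expressed against the same reference group:
    people with neither the risk allele nor the smoking exposure.
    """
    if min(rr_both, rr_gene_only, rr_smoking_only) <= 0.0:
        raise ValueError("relative risks must be positive")
    # RERI > 0 means the joint effect exceeds the sum of the separate effects.
    reri = rr_both - rr_gene_only - rr_smoking_only + 1.0
    # AP is the share of risk in the doubly exposed that is due to interaction.
    attributable_proportion = reri / rr_both
    return {"reri": reri, "attributable_proportion": attributable_proportion}


if __name__ == "__main__":
    # Hypothetical relative risks, for illustration only.
    result = additive_interaction(rr_both=8.0, rr_gene_only=3.0, rr_smoking_only=1.5)
    print(f"RERI = {result['reri']:.2f}")
    print(f"AP   = {result['attributable_proportion']:.2f}")
```

A RERI of zero corresponds to exactly additive effects, so values above zero indicate that carrying the risk allele and smoking together is associated with more risk than the two factors would contribute separately.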

Alveolar and Airway Inflammation in Preclinical RA

While pulmonary involvement of patients with RA is well described, particularly interstitial lung disease (ILD) as an extra-articular RA disease manifestation, data suggest that lung inflammation can precede RA diagnosis [66•, 67••]. While rates vary depending on RA duration, smoking exposure, medication history, and sensitivity of the testing modality, 8–36 % of RA patients have pulmonary function tests suggesting airways obstruction, 60–80 % of RA patients have airways disease on high-resolution computed tomography (CT) scans, and parenchymal disease is present in up to 79 % of RA patients [10•].

In a cross-sectional study of newly diagnosed RA patients, Wilsher and colleagues evaluated risk factors for lung imaging abnormalities in 60 patients within 1 year of RA diagnosis [68]. They found that bronchial wall thickening was present in 50 % of patients, bronchiectasis in 35 %, ground glass opacities in 18 %, and reticular changes in 12 % [68]. Additionally, CCP and RF correlated with decreasing diffusing capacity for carbon monoxide, suggesting significant alveolar damage in patients with early seropositive RA [68]. Similarly, Fischer and colleagues described significant disease burden in patients with respiratory symptoms who had CCP positivity without RA [69]. On high-resolution CT scans of the chest, 54 % had isolated airways disease, 14 % had ILD alone, and 26 % had airways disease and ILD [69]. Bronchial biopsies performed in a subset showed significant inflammation in the airways [69].

A study in Sweden compared imaging findings on high-resolution CT scans of the chest among 105 untreated patients with early RA to healthy controls [70••]. Parenchymal lung abnormalities were present in 63 % of CCP-positive RA patients compared to only 37 % of CCP-negative patients and 30 % of controls, suggesting that lung inflammation is particularly relevant to ACPA-positive RA [70••]. Airway changes on CT were detected more frequently in RA patients (66 %) compared to controls (42 %), suggesting that local airways inflammation is already present in patients recently diagnosed with RA [70••]. Another study by the same investigators examined bronchial biopsy in similar patients with untreated early RA and found significant local inflammation in the lung at the alveoli and airways [71••]. Adaptive immune cells (B cells, T cells, and plasma cells) were much more likely to be present and activated in CCP-positive patients compared to the CCP-negative RA patients or controls in both bronchial biopsies and bronchoalveolar lavage [71••].

These pulmonary abnormalities may have been due to factors other than preclinical RA disease processes. In particular, cigarette smoking may cause many of these abnormalities. However, similar airways changes are also seen in patients with RA who were non-smokers suggesting that smoking alone does not cause these abnormalities. When adjusted for smoking, differences remained between ACPA-positive RA, ACPA-negative RA, and controls. Since patients in these studies had very early RA and were untreated, it is less likely that other clinical factors such as decreased physical activity or medication use explain these differences.

Demoruelle and colleagues investigated pulmonary function and imaging among subjects with seropositivity (positive CCP2 or CCP3.1 ± ≥2 RF isotypes) without inflammatory arthritis [72]. This group at very elevated risk of RA was compared to autoantibody-negative controls who were unaffected first-degree relatives of RA patients as well as patients with early RA. Seropositive subjects showed significant burden of pulmonary abnormalities compared to controls [72]. Of those who were seropositive, 76 % had airways disease compared to only 33 % of controls. Fifty percent of seropositive cases without RA had bronchial thickening compared to only 13 % of seronegative controls; 69 % of seropositive cases had air trapping compared to 7 % of seronegative controls [72]. Early RA patients had more bronchial wall thickening and parenchymal abnormalities but were otherwise similar to seropositive cases without RA. Smoking was a matching factor and similar trends were seen when analyzing only never smokers.
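For readers who want to put the percentages above on the odds ratio scale, the sketch below converts a pair of group proportions into a crude odds ratio. This is back-of-envelope arithmetic on the rounded percentages quoted in the text, not a result reported by the study: it ignores the matched design, makes no adjustment for covariates, and provides no confidence interval, so the values should be read as rough indications of effect size only. The function name and the dictionary labels are introduced here for illustration.

```python
def crude_odds_ratio(p_group: float, p_reference: float) -> float:
    """Crude odds ratio from the proportion with a finding in two groups."""
    for proportion in (p_group, p_reference):
        if not 0.0 < proportion < 1.0:
            raise ValueError("proportions must be strictly between 0 and 1")
    odds_group = p_group / (1.0 - p_group)
    odds_reference = p_reference / (1.0 - p_reference)
    return odds_group / odds_reference


if __name__ == "__main__":
    # (seropositive without RA, controls), as rounded percentages from the text.
    findings = {
        "airways disease": (0.76, 0.33),
        "bronchial thickening": (0.50, 0.13),
        "air trapping": (0.69, 0.07),
    }
    for finding, (p_seropositive, p_control) in findings.items():
        ratio = crude_odds_ratio(p_seropositive, p_control)
        print(f"{finding}: crude OR = {ratio:.1f}")
```

Small denominators make such ratios unstable, which is one reason the published comparison of percentages is the more cautious presentation.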

Together, these studies in early RA and in seropositive subjects without RA provide compelling evidence that inflammation occurs in the airways and alveoli prior to disease onset. Further, many of these patients with RA had no evidence of ILD on biopsy or imaging, arguing that these abnormalities are important for ACPA-positive RA in general and not only for patients with RA who develop ILD.

Citrullination of Antigens in Respiratory Mucosa

While studies demonstrate excess local inflammation in the alveoli and airways in preclinical RA, the biological processes occurring here are challenging to study since these individuals are difficult to identify and are otherwise often healthy. Citrullination is a process by which arginine residues on proteins, such as fibrinogen and enolase, are converted to citrulline by the enzyme peptidylarginine deiminase (PAD) [73•]. Autoantibodies to citrullinated protein antigens are highly specific to RA and presence of these in preclinical phases of RA greatly increases the risk of progression to clinical RA. Therefore, citrullination has been postulated to be one of the biologic processes occurring locally in the lung that may initiate RA pathogenesis. Citrullination may be the important first step in the loss of immune tolerance, forming self-antigens that are then presented to T cells that eventually result in activated B cells producing ACPA [74, 75•].

Reynisdottir and colleagues analyzed fluid from bronchoalveolar lavage and found increased citrullination of proteins in CCP2-positive early untreated RA patients compared to CCP2-negative RA and healthy controls [71••]. Ytterberg and colleagues performed proteomic screening of bronchial and synovial biopsies of RA patients to find proteins in common at both sites that might link the lung and joints [76••]. They identified two citrullinated vimentin peptides in the majority of these patients at both anatomic sites, perhaps providing a link between initial breakdown of the innate immune system locally in the lung and inflammation in the joints after adaptive immune system dysregulation [76••].

While these findings are provocative, they require replication and would likely not explain the full spectrum of ACPA-positive RA pathogenesis. Smoking may be the extrinsic factor that causes local mucosal inflammation in the lung, upregulating PAD in susceptible hosts, leading to citrullination of proteins that form neoantigens that stimulate the immune system and induce autoimmunity. However, there may be other parallel biologic processes that also lead to RA. These might include a collection of neoantigens (instead of a single instigator), inflammation at other mucosal sites (in particular the gut and periodontium), or genetic susceptibility factors that are currently undiscovered.

The Lung as a Site of Initial RA-Related Autoantibody Formation

While the particular immunologic mechanisms occurring locally in the lung prior to RA development remain to be fully elucidated, data support the notion that autoantibody development is initiated specifically in the lung (Table 2). Willis and colleagues investigated individuals at high risk of RA due to family history or detectable serum antibodies [77••]. They evaluated a variety of RA-related autoantibodies in the serum and induced sputum of seronegative relatives, seropositive relatives, and early RA patients. They found that 65 % of seropositive at-risk relatives had evidence for at least one RA-related autoantibody in the sputum (CCP2, CCP3, CCP3.1, IgG-RF, IgA-RF, or IgM-RF) compared to 86 % of early RA subjects, and only 35 % of seronegative at-risk relatives [77••]. In addition, most of those with detectable serum RA-related autoantibodies had the same autoantibody detectable in the sputum [77••]. The authors conclude that these results provide evidence that autoantibodies are initially locally produced in the lung prior to epitope spreading and subsequent detection in the serum. Some argued that sputum may have been contaminated by oral secretions in this study, which is especially relevant since the periodontium is another mucosal surface that may be important in RA pathogenesis [80]. In particular, the bacterium Porphyromonas gingivalis synthesizes a type of PAD that may induce local citrullination in the gingiva [81, 82].

Table 2 Selected epidemiologic and translational studies investigating the lung in RA preclinical phase transitions

Reynisdottir and colleagues also studied whether RA-related autoantibodies were present locally in the lung of newly diagnosed untreated RA patients by obtaining specimens from bronchoalveolar lavage [70••]. They found that CCP-positive RA patients had higher levels of ACPA in the bronchoalveolar lavage samples than in the serum, suggesting that the autoantibody production may have started in the lung [70••]. These findings also correlated with structural lung disease on high-resolution CT scans of the chest [70••]. Only one CCP-negative RA patient and no healthy controls had detectable ACPA in the lung [70••]. Since the fluid was collected by bronchoalveolar lavage, it was unlikely that oral flora contaminated these specimens. While preclinical cases were unavailable in this study, the RA patients were all untreated and were recently diagnosed, arguing that these changes likely preceded the clinical onset of RA.

A study by Janssen and colleagues evaluated whether other diseases involving inflammation of the mucosal surfaces might have detectable RA-related autoantibodies in the serum. They tested CCP2 in the sera of patients with periodontitis, bronchiectasis, and cystic fibrosis as well as patients with RA and healthy controls [78•]. CCP2 seropositivity was associated with RA, bronchiectasis, and cystic fibrosis but not periodontitis [78•]. The authors conclude that this provides further evidence that local inflammation of the mucosa in the lung may provide conditions necessary for citrullination and ACPA production. Studies of seropositive RA-ILD without joint involvement and the previously described study of patients with pulmonary complaints and CCP positivity without RA argue that there is a group of patients on the spectrum of RA where clinical pulmonary disease precedes or is more severe than joint symptoms [69, 83, 84•, 85, 86].

Future Directions

While much progress has been made in elucidating the pathogenesis of RA, further work is needed in nearly every phase of preclinical RA. While the shared epitope is the strongest genetic risk factor for RA, currently described genetic factors only explain 18 % of the genetic variance for RA [50••, 87•]. Other potential RA susceptibility factors such as epigenetics, the microbiome, and metabolomics may be useful in identifying high-risk individuals and further bridge genetics with environmental factors for RA development [6•, 88]. Other lifestyle factors, particularly metabolic factors such as diet and obesity, may be important in the development of RA [89•, 90•, 91•]. Future epidemiologic studies of interactions of smoking with other lifestyle factors or biomarkers may offer novel hypotheses for the biology behind preclinical RA phase transitions. While smoking is clearly important in RA etiology, many non-smokers develop RA so studying other environmental factors may provide important alternate hypotheses for RA. Studies concerning the microbiome and periodontitis have already offered other hypotheses for RA development [82, 88].

While structural abnormalities in the alveoli and airways of individuals with preclinical RA are well documented, less is known about the immunologic changes occurring locally in these patients. Most studies investigating this have used early RA patients and inferred that these changes occurred prior to clinical onset [70••, 71••]. Since ACPA-positive RA patients often develop seropositivity within 5 years before clinical onset, these transition states have been difficult to study directly because such patients are hard to identify. Since many are otherwise healthy, the societal benefit of lung biopsy or even bronchoscopy may not outweigh the risk to the individual subjects. However, if the particular citrullinated neoantigen(s) formed during these processes were identified, targeted approaches to RA prevention would likely have much more potential for success and may also have importance in identifying targets for treatment of early and established RA. Lastly, while much progress has been made concerning the natural history of RA using banked specimens of patients who later developed RA, there are only a few longitudinal studies of at-risk subjects available (typically unaffected first-degree relatives or those with arthralgias or undifferentiated inflammatory arthritis) [5••, 11•, 42••, 74, 92]. Larger longitudinal studies with longer follow-up of individuals at increased risk are necessary to understand the natural history, but recruitment and retention are major challenges in this otherwise healthy population.

Preclinical disease processes may have clinical implications for patients with established RA. Several groups have recently identified respiratory mortality as a major contributor to the excess mortality of RA patients, particularly those with seropositive RA [3••, 93••, 94]. Emerging evidence suggests that this increased risk may be due to obstructive lung diseases and that RA patients are at particularly increased risk for asthma and COPD [3••, 79••, 95•, 96]. Since airways disease is particularly common in preclinical RA, these lesions may predispose RA patients to later develop clinical obstructive disease, beyond the effect of smoking. Preclinical pulmonary processes may also further subset patients with RA who later develop ILD which has known excess mortality in RA [97]. Identifying RA patients early in the development of ILD may similarly have major prevention or treatment implications and many of these processes likely are initiated in the preclinical RA phases [98••].

Finally, understanding the pathogenesis of RA may have public health implications beyond RA. Studies of excess cardiovascular disease (CVD) outcomes among RA patients have shaped the understanding of how chronic inflammation affects the biology of CVD and have had broad public health significance beyond RA [99–103]. Similar to this framework, inflammation and autoimmunity may also be the common link between RA and increased respiratory disease burden and mortality. Bronchiolar immune tolerance loss with resultant systemic inflammation may bridge RA pathogenesis to clinical outcomes [94]. Evaluating the effect of RA on respiratory outcomes therefore has importance not only for RA patients, but may establish the roles of systemic inflammation and autoimmunity on respiratory outcomes beyond RA, which may have broad biologic, clinical, and public health implications.

Conclusions

The understanding of the pathogenesis of RA has progressed over the past two decades through both epidemiologic and translational studies. Specifically, epidemiologic investigations of smoking and risk of RA have informed the paradigm for RA development and generated novel hypotheses that have been tested in translational studies to further understand the biology of RA pathogenesis. The association of smoking with RA led investigators to initially consider the lung as a site of RA pathogenesis. Later studies associated smoking with progression to inflammatory arthritis or classifiable RA in high-risk populations. The particular association of smoking with seropositive RA, particularly ACPA-positive RA, led investigators to consider citrullination as a biologic process central to RA pathogenesis. The identification of structural lung abnormalities and local autoantibody production in the lungs adds to this hypothesis. Finally, the HLA shared epitope-smoking interaction, occurring specifically in the peptide binding groove of HLA-DRβ1, offers the possibility of a physical interaction between a neoantigen induced by smoking and the immune cells where dysregulation may be initiated.