1 Introduction

Enthesitis is a hallmark of spondyloarthritis (SpA) and occurs in the axial and peripheral skeleton. In spondyloarthropathies, such as psoriatic arthritis (PsA), peripheral enthesitis is felt to precede joint symptoms and is associated with a higher degree of erosive disease [1]. Enthesitis is one of the major domains addressed in treatment guidelines [2]. The therapeutic armamentarium for treating spondyloarthropathies has expanded significantly. Therapeutic trials have also given insights into the pathophysiology of enthesitis. In the following review, we examine the pathophysiology of enthesitis, followed by its clinical evaluation, and then discuss the evidence for the relative efficacy of contemporary targeted therapies.

2 Pathophysiology of Peripheral Enthesitis

In order to understand the current therapeutic armamentarium and targeting in treating enthesitis, it is important to recognize the myriad of advances in the understanding of enthesitis pathophysiology [3]. As entheseal tissue is difficult to access, initial work depended on epidemiology to show an association of genetic markers such as HLA-B27. Initial genome-wide association studies (GWAS) suggested an association of the interleukin (IL)-23 receptor gene and Crohn’s disease. Furthermore, genes encoding the p40 subunit, shared between IL-12 and IL-23, were also found to be associated with Crohn’s disease and psoriasis [4]. Since then, multiple animal models have examined the role of genetic factors [5]. Subsequent work has shown that IL-23, but not IL-12, promotes IL-17 production from activated T-cells [4]. Knockout mice experiments confirmed that IL-23 and IL-17 were key cytokines in the development of autoimmunity. The differential response to the various classes of biologic agents and the varying manifestations of SpA have been informative and intriguing. The understanding of the pathophysiology of enthesitis includes the pathoanatomical perspective, innate and adaptive immunity, gastrointestinal flora, and overlap with the IL-23 and IL-17 axis.

2.1 Pathoanatomical Considerations

Generally, entheses are where ligaments or tendons attach to bone, but, as we discuss below, other concepts of entheses have also been advanced. The role of the enthesis is not only to enable stabilization and movement of a limb but also to allow force dissipation [6]. Force dissipation occurs within the tendon, and collagen fibrils act as a spring. When the entheseal attachment has a wide range of excursion, fibrocartilaginous change occurs at the attachment site, offering a nonrigid mechanism for the dissipation of mechanical energy [7]. Fibrocartilaginous entheses include the supraspinatus insertion, Achilles tendon, and plantar fascia attachments. In contrast, entheses that have collagen fibrils encased in bone are termed fibrous entheses and include the deltoid insertion at the deltoid tuberosity of the humerus [8]. Spondyloarthropathies typically affect fibrocartilaginous entheses [8]. It is important to note that other structures surrounding the entheses, such as fat pads and bursae, also contribute to minimizing friction and dissipating mechanical energy. The concept of an ‘enthesis organ’ recognizes that stress dissipation involves not only the enthesis itself but also the surrounding tissues [9]. It also helps to explain high-resolution sonographic imaging findings, which typically demonstrate abnormalities in multiple tissues (Fig. 1).

Fig. 1

Long axis image of the distal Achilles enthesis in a patient with spondyloarthropathy. The thickened Achilles tendon has a loss of fibrillar echotexture, and a large enthesophyte is present distally. A small amount of fluid is present in the retrocalcaneal bursa (arrow) adjacent to Kager’s fat pad. A power Doppler signal within the tendon and distal enthesis indicates neovascularity. AT Achilles tendon, E enthesophyte, K Kager’s fat pad

The discussion above regarding mechanical dissipation is particularly important for two reasons. First, it is essential to note that on imaging and in histologic samples, it may not be possible to distinguish mechanical enthesopathy from inflammatory enthesopathy, especially when only one site is involved [6, 10]. Furthermore, biomechanical factors such as obesity and activity may confound the assessment of enthesitis [11, 12]. In fact, biomechanical factors may explain the distribution of entheses involved in SpA [13]. Further evidence in support of the biomechanical influence on enthesitis expression includes animal models that demonstrated reduced enthesitis with hind limb unloading [14]. In addition, a high level of mechanical stress is associated with increased radiographic peripheral damage in patients with longstanding PsA [15].

2.2 Role of Innate and Adaptive Immunity at the Enthesis and Beyond

It has been proposed that inflammation may originate in the synovio-entheseal complex that then leads to inflammation of the surrounding articular and periarticular structures [1]. Seminal work conducted by Sherlock et al., who demonstrated that IL-23 is a key driver of enthesitis in mice and acts via previously unidentified T cells, supports this hypothesis [16]. These T cells are characterized by the transcription factor RAR-related orphan receptor γt (ROR-γt), as well as CD3+CD4−CD8− cell surface markers. Stimulation of the IL-23 receptor on these cells results in cytokine production, including IL-17 and IL-22. Cuthbert et al. found cells with similar cytokine signatures in humans [17] and reported the presence of type three innate lymphoid cells (ILC-3) in human entheseal tissue obtained from spinal surgeries, Achilles tendon rupture repairs, and knee arthroplasty. Cells from normal entheses were stimulated by IL-23 and upregulated IL-17A transcription. Furthermore, ROR-γt-expressing cells have been isolated from damaged entheses. However, IL-17A production may not be tightly linked to IL-23 in all phases of SpA, and the biology of these cytokines may overlap. van Tok et al. demonstrated that anti-IL-23R intraperitoneal injections prevented spondylitis and arthritis development in the HLA-B27 rat model; however, anti-IL-23R administered after the onset of arthritis and spondylitis did not suppress disease [18]. This finding is especially intriguing considering the differential efficacy of ustekinumab, an anti-IL-12/23 p40 antibody. Ustekinumab has demonstrated efficacy across all major domains for psoriatic diseases, including peripheral enthesitis [19, 20]; however, despite showing efficacy in a small phase II trial in patients with ankylosing spondylitis [21], ustekinumab failed to show efficacy for both primary and secondary endpoints in a larger phase III study (ClinicalTrials.gov identifier: NCT02437162).
Two further trials that enrolled anti-tumor necrosis factor (TNF)-α-refractory participants with active radiographic axial SpA (NCT02438787) and active nonradiographic axial SpA (NCT02407223) were terminated. Interestingly, risankizumab, which targets the p19 subunit of IL-23, has demonstrated efficacy in treating psoriasis, PsA, and Crohn’s disease; however, in a phase II trial, risankizumab did not show efficacy in primary and secondary endpoints for treating ankylosing spondylitis [22]. It is also notable that targeting IL-17 was not effective in treating Crohn’s colitis [23, 24] or uveitis [25]. In a recent editorial, Siebert et al. proposed that the different tissue compartments involved in SpA may have different cytokine signatures [26], and also suggested there may be IL-23-independent pathways generating IL-17 in SpA, in keeping with the animal model data discussed above [26].

2.3 Contribution of the Gut to the Development of Spondyloarthritis (SpA)

Interest in the contribution of gut bacteria has stemmed from the observation that some spondyloarthropathies, such as reactive arthritis, are linked to infection in the genitourinary or gastrointestinal tracts. There has been much interest in the arthritogenic peptide theory, where bacterial antigens bind to the groove of the HLA-B27 molecule and thus propagate an immune response [27]. Rats transgenic for HLA-B27 develop a disease similar to human SpA; however, when these rats are raised in a germ-free environment, they do not develop inflammatory bowel lesions or joint disease [28]. More recently, attention has switched to the gut microbiome. Preliminary studies suggest that ileal microbial flora may be altered in ankylosing spondylitis, and genes associated with the disease may affect the microbiome [29]. It is unclear whether dysbiosis is an epiphenomenon or whether it contributes to the disease through immunogenic mechanisms or via altered gut permeability [30]. It is therefore intriguing to note that vedolizumab, an antibody that targets α4β7 integrin, used to treat inflammatory bowel disease (IBD), has been associated with worsening or the emergence of new axial and peripheral SpA [31,32,33]. These case series have identified subjects that are predominantly HLA-B27-negative who have been anti-TNF inadequate responders for their IBD. Vedolizumab controls bowel disease adequately by presumably reducing the trafficking of T cells into the lamina propria and thus reducing inflammation [34]; however, normal gut permeability may not be restored, allowing egress of bacteria or antigens [31]. Further prospective studies, incorporating genetic and microbiome analyses, are necessary to confirm whether vedolizumab exacerbates quiescent cases of IBD or causes a paradoxical effect.

3 Epidemiology of Peripheral Enthesitis

Enthesitis is a key feature of spondyloarthritis. As a group of interrelated conditions, spondyloarthritides have overlapping features. Using the European Spondyloarthritis Study Group (ESSG) criteria, the prevalence of SpA is estimated at 0.01–2.5%; however, using modified New York criteria, prevalence rates of between 0.007 and 0.4% for ankylosing spondylitis and between < 0.1 and 0.4% for PsA have been reported [35]. In a large cohort study of Brazilian patients with SpA based on the ESSG criteria, clinical enthesitis defined by at least one affected enthesis was present in 54% of the cohort, with the majority of these cases diagnosed as ankylosing spondylitis [36]. The prevalence of peripheral enthesitis in PsA has been reported in cohort studies, as well as in therapeutic trials. Polachek et al. estimated the clinical prevalence of enthesitis as 35% in PsA patients [37], while Ranza et al. estimated a 30% prevalence rate of enthesitis in PsA patients followed in dermatology clinics [38]. In a recent review, the prevalence of baseline enthesitis in clinical trials of PsA ranged from 24 to 83% [39]. These prevalence estimates depend on clinical evaluation of enthesitis, which varies based on the instrument used, and may not be sensitive or specific.

4 Impact of Peripheral Enthesitis

The presence of enthesitis by itself can be associated with pain and loss of function, as well as a higher level of disease activity. In a retrospective, cross-sectional analysis of the Consortium of Rheumatology Researchers of North America (CORRONA) database, Mease et al. reported higher disease activity in PsA patients with enthesitis, compared with those without enthesitis [40]. PsA patients with enthesitis were more likely to have greater pain and more likely to have work or activity impairment. Similar findings were reported in a post hoc analysis of the ustekinumab PSUMMIT 1 and 2 trials, where anti-TNF-naïve PsA patients with improved clinical enthesitis at week 24 also had improvements in physical function and health-related quality of life regardless of the American College of Rheumatology (ACR) 20 joint response [41]. In a post hoc analysis of PsA patients in two phase III trials of ixekizumab, Gladman et al. reported that 80% of patients with enthesitis or dactylitis had moderate to severe pain and discomfort scores in the five-level EQ-5D (EQ-5D-5L) quality-of-life instrument [42]. Patients who had no tenderness in the six Leeds Enthesitis Index (LEI) areas had less pain and higher quality of life than patients whose LEI scores were greater than zero. Similarly, in patients with ankylosing spondylitis, a high positive correlation was noted between clinical and sonographic scores of enthesitis and the Bath AS Disease Activity Index (BASDAI) [43]. In a cohort study of SpA patients, based on the ESSG criteria, Carneiro et al. reported that enthesitis was statistically associated with axial symptoms. In addition, enthesitis was associated with higher disease activity, lower quality of life, and decreased function in this study [36]. In addition to the impact of enthesitis on function and pain, the severity of enthesitis is linked to higher levels of peripheral joint damage in PsA patients [44].

5 Clinical Evaluation of Peripheral Enthesitis

A summary of major entheseal indices used in therapeutic clinical trials is shown in Fig. 2 (see the electronic supplementary material [ESM] for a detailed table of entheseal indices). While some of the scoring systems were developed in an ad hoc manner, others were developed and validated in different patient populations. The clinical evaluation of enthesitis depends on eliciting tenderness at the site of the enthesis by finger pressure. The number and specific entheses chosen depends on the instrument used. The earliest indices stemmed from a need to examine peripheral enthesitis in ankylosing spondylitis. Mander et al. developed a 66-point index, where each site had pressure applied and tenderness was graded on a 0–3 scale [45]. The Maastricht Ankylosing Spondylitis Enthesitis Score (MASES) was developed using a reduction approach, leading to the use of 13 anatomical sites. In addition, tenderness reporting was changed to being either present or absent [46]. The MASES has subsequently been modified by adding the plantar fascia insertion (PsA MASES) [19, 47], and adding the plantar fascia, quadriceps, and patellar ligament insertions (Modified-MASES) [48]. A 12-site entheseal index (Berlin Index) was initially used by Braun et al. in a trial of ankylosing spondylitis patients treated with infliximab [49]; however, it is not clear how this score was derived or validated from the referenced publication. In contrast, the Spondyloarthritis Research Consortium of Canada (SPARCC) Index was derived based on entheses commonly involved, using prior published sonographic and magnetic resonance imaging (MRI) studies. The scoring system was then validated in the clinical trial of ankylosing spondylitis patients treated with adalimumab [50]. The San Francisco group developed a 17-entheseal index based on the modified Newcastle Enthesis Index (NEI). 
Of note, tenderness was scored from 0 to 3 (a score of 0 indicated no pain; 1 indicated mild tenderness; 2 indicated moderate tenderness; and 3 indicated tenderness severe enough to elicit a wince or withdrawal) [51]. For PsA studies, SPARCC, MASES, and its modified forms, which were developed in ankylosing spondylitis patients, have been used. The LEI was developed explicitly for PsA and only has three extremity entheses that are examined bilaterally [52]. Indices used for PsA have more peripheral entheses, while indices used for ankylosing spondylitis have proportionally more axial sites (Fig. 3). Two notable studies compared entheseal indices. In ankylosing spondylitis patients receiving golimumab, the Berlin, University of California San Francisco (UCSF), and MASES indices were examined. Although all indices were able to show improvement, the UCSF Index had the highest and only statistically significant improvement but with a low effect size (Table 1) [53]. Interestingly, the UCSF Index had a higher number of axial entheseal sites (Fig. 3). The LEI, SPARCC, and MASES indices were examined in a placebo-controlled portion of a 12-week study of nonpsoriatic peripheral SpA patients comparing adalimumab and placebo. LEI and SPARCC performed better than MASES, which the authors postulated may be due to the differential number of axial/central versus peripheral entheseal sites (Figs. 2, 4) [54, 55].
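The two scoring conventions described above (tenderness recorded as present or absent, as in the MASES and LEI, versus tenderness graded from 0 to 3, as in the Mander and UCSF approaches) can be illustrated with a minimal sketch. The function names and the site labels are hypothetical and chosen only for illustration; the site labels are not taken from the text, and the example data are invented.

```python
from typing import Mapping

# Hypothetical labels for a six-site index (three entheses examined bilaterally).
# These names are illustrative assumptions, not a definition from the text.
SIX_SITE_INDEX = (
    "lateral_epicondyle_left", "lateral_epicondyle_right",
    "medial_femoral_condyle_left", "medial_femoral_condyle_right",
    "achilles_insertion_left", "achilles_insertion_right",
)


def binary_index_score(tender: Mapping[str, bool]) -> int:
    """Present/absent scoring: the score is the number of tender sites."""
    return sum(1 for is_tender in tender.values() if is_tender)


def graded_index_score(grades: Mapping[str, int]) -> int:
    """Graded scoring: each site is graded 0-3 and the grades are summed."""
    for site, grade in grades.items():
        if grade not in (0, 1, 2, 3):
            raise ValueError(f"grade for {site} must be 0, 1, 2, or 3")
    return sum(grades.values())


# Invented examination: two of six sites are tender.
example_exam = {site: False for site in SIX_SITE_INDEX}
example_exam["achilles_insertion_left"] = True
example_exam["lateral_epicondyle_right"] = True
six_site_score = binary_index_score(example_exam)

# Invented graded examination of three sites.
graded_score = graded_index_score({"site_a": 3, "site_b": 1, "site_c": 0})
```

With a binary index, the maximum score equals the number of sites, so an index with more sites has a wider range; a graded index additionally weights each site by severity.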

Fig. 2

Sites and distribution of the three commonly used entheseal indices. Lat lateral, Med medial, LEI Leeds Enthesitis Index, SPARCC Spondyloarthritis Research Consortium of Canada, MASES Maastricht Ankylosing Spondylitis Enthesitis Score

Fig. 3

Distribution of type of entheses for major indices. Each anatomical area is only counted once. Axial sites include the spine, chest wall, and pelvis entheses. MEI Mander Enthesitis Index, UCSF University of California San Francisco Index, MASES Maastricht Ankylosing Spondylitis Enthesitis Score, PSA-MASES Psoriatic Arthritis-MASES, Mod-MASES modified MASES, LEI Leeds Enthesitis Index, SPARCC Spondyloarthritis Research Consortium of Canada

Table 1 Randomized-controlled trials of Biologics and Small molecules for Enthesitis in Spondyloarthritis
Fig. 4

Enthesitis metrics in the adalimumab nonpsoriatic peripheral spondyloarthritis trial. a Prevalence by entheseal measure and any enthesitis ≥ 1. b Baseline mean score and confidence intervals by index used as well as all entheses (14 paired sites and spinous process L5). c Mean change of entheseal score and confidence intervals by index used. Asterisk denotes a statistical significance of at least p < 0.05. Enth entheses, LEI Leeds Enthesitis Index, MASES Maastricht Ankylosing Spondylitis Enthesitis Score, SPARCC Spondyloarthritis Research Consortium of Canada [54, 55]

As outlined above, the clinical evaluation of enthesitis depends on eliciting tenderness by palpation, which may be insensitive when compared with detection by imaging [56]. Palpation does not inform of underlying inflammatory changes or structural damage in the enthesis [57]. In subjects with fibromyalgia, it may be difficult to assess the level of enthesitis due to allodynia, as well as the proximity of fibromyalgia tender points to entheses [57]. Recent studies have attempted to use ultrasound to differentiate between enthesitis due to SpA and fibromyalgia [57,58,59]. Psoriatic subjects with fibromyalgia report a greater level of tenderness at entheseal sites. However, as an objective measure independent of patient reporting, ultrasound composite scores of entheses can distinguish between the groups. Changes at a single enthesis, or not using Doppler ultrasound to ascertain blood flow at the enthesis, could not differentiate between the groups [57,58,59]. Importantly, biomechanical confounders, such as body mass index (BMI), affect the prevalence of chronic entheseal changes. Biomechanical confounders particularly affect PsA studies since the BMI is higher in these subjects. In contrast, in patients with IBD-related SpA who had a lower BMI, an ultrasound index using Doppler was able to differentiate between the groups with and without concurrent fibromyalgia [59]. Clinical indices for enthesitis are responsive, but it is unclear, beyond pain relief, if the response translates to true resolution and reversal or slowing the progression of the underlying morphological alteration of the entheses [60]. For example, it is unclear if structural changes to the enthesis continue to propagate, even after the inflammation subsides after treatment [61]. There is an opportunity for future studies to include concurrent imaging of entheses in therapeutic trials, but important biomechanical confounders also need to be considered [39].

6 Deciphering the Effect of Targeted Therapies on Enthesitis

A systematic literature search was conducted, with the assistance of a librarian, to identify placebo-controlled, randomized trials in patients with SpA that reported enthesitis as an outcome measure (see ESM for details). Articles that examined the effects on entheseal measures in a placebo-controlled phase of therapeutic randomized controlled studies were chosen (Table 1). Since enthesitis is a secondary outcome measure, pooled analyses of subjects with enthesitis from different clinical trials, but using the same therapeutic agent, were included when baseline and end of the placebo period metrics were reported for the subgroups. Long-term extension studies were excluded since no placebo comparisons would be available. A total of 45 articles were selected, of which 15 related to anti-TNF, 10 related to anti-IL-17, 8 related to anti-IL-12/23 or IL-23, 2 related to anti-IL-6, and 1 related to anti-CD80/86 therapies. There were four articles on agents targeting the Janus kinase (JAK) pathway and five relating to apremilast, and there were four pooled studies, one each for ustekinumab, secukinumab, ixekizumab, and apremilast. There were six active comparator trials. Four had blinded phases, and one open-label study with ustekinumab was included to contrast with the other studies. Despite varying enthesitis tools and sample sizes, we attempt to describe the nuances of this heterogeneous data set in Sects. 6 and 7. A detailed abstraction can be found in Table 1. Cohen’s d effect size and confidence intervals were calculated to standardize comparisons where data were available or if cited in other published reviews. Generally, the effect size using a Cohen’s d statistic of < 0.2 is considered negligible, 0.2–0.5 is considered small, 0.5–0.8 is considered moderate, and > 0.8 is considered large. Confidence intervals can be used to gauge the reliability of the estimate [52, 62].
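As an illustration of the effect size calculation described above, the following is a minimal sketch of Cohen's d for a treatment versus placebo comparison, together with the interpretation bands quoted in the text. The review does not state which variance or confidence interval formula was used, so the pooled standard deviation and the large-sample approximation for the standard error of d shown here are assumptions, as are the function names and the way boundary values (exactly 0.2, 0.5, or 0.8) are assigned. The input numbers are invented.

```python
import math


def cohens_d(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Cohen's d (treatment minus control) with an approximate 95% CI.

    Assumes a pooled standard deviation and the common large-sample
    approximation for the standard error of d; the source does not
    specify which formulas were used.
    """
    pooled_var = ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    d = (mean_t - mean_c) / math.sqrt(pooled_var)
    se = math.sqrt((n_t + n_c) / (n_t * n_c) + d ** 2 / (2 * (n_t + n_c)))
    return d, (d - 1.96 * se, d + 1.96 * se)


def interpret(d):
    """Map |d| onto the bands quoted in the text (boundary handling is an assumption)."""
    magnitude = abs(d)
    if magnitude < 0.2:
        return "negligible"
    if magnitude < 0.5:
        return "small"
    if magnitude <= 0.8:
        return "moderate"
    return "large"


# Invented example: mean change in an enthesitis score of -2.0 (treatment)
# versus -1.0 (placebo), standard deviation 2.0 and 50 patients in each arm.
d, (ci_low, ci_high) = cohens_d(-2.0, 2.0, 50, -1.0, 2.0, 50)
label = interpret(d)
crosses_zero = ci_low < 0 < ci_high
```

A confidence interval that crosses zero, as reported for several trials in Table 1, indicates that the data are compatible with no difference between the arms even when the point estimate suggests an effect.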

6.1 Anti-Tumor Necrosis Factor Therapies

Anti-TNF therapies are the cornerstone of biologic treatments of SpA and are the first-line biological therapies in recent society guidelines [63,64,65]. In the following section, we discuss trials that examined the effects on enthesitis after the administration of anti-TNF agents.

6.1.1 Adalimumab

Mease et al. conducted a double-blind, placebo-controlled study in moderate to severe PsA patients who were inadequate responders to nonsteroidal anti-inflammatory drug therapy. Adalimumab was compared with placebo using a 4-point enthesitis tool [66]. Although the adalimumab group had a greater resolution in enthesitis at the end of 24 weeks, the results were not statistically significant and may be a consequence of the limited enthesitis tool used in this study because it only included four sites—the Achilles tendon and plantar fascia bilaterally. The study reported that the bodyweight of subjects was evenly distributed between the treatment arms, but the specific BMI was not reported. Of note, the two entheseal sites chosen may be confounded by mechanical tendinosis, which may have also contributed to the lack of efficacy in the treatment group. Genovese et al. conducted a similar but smaller study of PsA subjects with inadequate response to nonbiologic disease-modifying antirheumatic drugs (DMARDs) and failed to show any difference between adalimumab and placebo using the same 4-point enthesitis index [67]. Adalimumab 40 mg every other week was compared with placebo over 12 weeks in patients with active nonpsoriatic peripheral SpA [54]. HLA-B27 prevalence was 61.5%, and 86.5% had enthesitis of more than one site. This study was unique in that it compared three entheseal indices as well as all the entheses used (Fig. 4a). Notable trends (Fig. 4a, b) included a high prevalence of enthesitis, as well as a trend for higher prevalence and baseline mean score with the use of a higher number of entheses. The SPARCC and LEI showed statistical improvement at week 12, but not the MASES Index (Fig. 4c). 
In a subsequent 12-week post hoc analysis of patients with baseline enthesitis, Guyatt’s effect size (mean change in the adalimumab group divided by the standard deviation of the placebo group) was larger for the LEI (− 1.07) and SPARCC (− 0.99) enthesitis indices than for the MASES (− 0.81) [55]. We calculated Cohen’s d effect size based on the mean difference between adalimumab and placebo at 12 weeks and noted that the LEI had the largest effect size followed by the SPARCC Index, with similar confidence intervals (Table 1). In contrast, the MASES Index had the smallest effect size and the confidence interval crossed zero. An important observation was new-onset enthesitis at 12 weeks at sites that were negative at baseline. In the adalimumab group, new enthesitis ranged between 1.5 and 7.0% at most locations. The report of new-onset enthesitis highlights an essential bias in the many studies that analyze only subjects with baseline enthesitis and hence exclude any new onset of enthesitis.
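Guyatt's effect size, as defined in the paragraph above, can be sketched as follows. This is a minimal illustration under stated assumptions: the standard deviation of the placebo group is taken here as the sample standard deviation of the placebo group's change scores, which the text does not specify, and the function name and patient-level data are invented.

```python
import statistics


def guyatt_effect_size(treated_changes, placebo_changes):
    """Mean change in the treated group divided by the placebo group's SD.

    Assumption: the placebo SD is the sample standard deviation of the
    placebo group's change scores.
    """
    return statistics.mean(treated_changes) / statistics.stdev(placebo_changes)


# Invented change-from-baseline scores (negative values indicate improvement).
treated = [-2, -1, -3, -2]
placebo = [-2, 0, 2]
effect_size = guyatt_effect_size(treated, placebo)
```

Unlike Cohen's d, which divides the between-group difference by a pooled standard deviation, this statistic scales the treated group's change by the variability seen under placebo, so the two measures are not directly interchangeable.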

6.1.2 Etanercept

Gorman et al. conducted a study with etanercept, which had a total of 40 patients with active inflammatory ankylosing spondylitis, and used the modified Newcastle Enthesitis Index (NEI), now known as the UCSF Index [51]. The UCSF Index consists of 17 enthesitis sites with a score ranging from 0 to 17 [51, 53]. Patients had mild enthesitis based on low median NEI scores (Table 1). At the end of the 16-week placebo-controlled period, improvement of the median enthesitis score in the etanercept group compared with the placebo group was statistically significant. Mean scores and proportions of patients with enthesitis at baseline and 16 weeks were not reported. Due to the small size of the study, skewed data, and the fact that peripheral enthesitis was a secondary outcome measure, the main conclusion from this study may be that etanercept may improve peripheral enthesitis, but this should be confirmed in larger studies. Dougados et al. studied etanercept monotherapy compared with placebo in 215 nonradiographic axial SpA patients [68]; 81% had MRI evidence of sacroiliitis, heel enthesitis prevalence was 41.9%, and the majority were male. At the end of 12 weeks, a small, statistically significantly greater mean change in MASES in the etanercept group was reported in comparison with the placebo group. A small effect size confirmed this trend, with the respective confidence interval crossing zero (Table 1).

6.1.3 Infliximab

Infliximab was used in three controlled trials, with conflicting data. In a small study of ankylosing spondylitis patients comparing infliximab with placebo, the baseline prevalence of enthesitis was low. Small changes after 12 weeks were statistically significant using analysis of covariance, but the effect size was negligible [49]. Antoni et al. conducted a study comparing infliximab with placebo in active PsA patients [69]. Only 13 subjects in each group had enthesitis at baseline. At 16 weeks, there was a significantly greater reduction in the proportion of patients with enthesitis in the infliximab group compared with the placebo group. The statistically significant results should prompt caution due to the small sample of patients with enthesitis, as well as the use of the 4-point enthesitis index. In contrast, van der Heijde’s study comparing infliximab with placebo included 279 ankylosing spondylitis patients and used the Mander Enthesitis Index (MEI) over 24 weeks [70]. There was a nonsignificant reduction in the median enthesitis score at the end of the placebo-controlled period, which may be due to the low prevalence of enthesitis as well as the use of the MEI. The MEI includes 66 enthesitis sites and may not have been a reliable measure between investigators.

6.1.4 Certolizumab

Mease et al. completed the only placebo-controlled trial on certolizumab in PsA patients, using the LEI [71]. Approximately two-thirds of patients had enthesitis with moderate LEI scores. The certolizumab 400 mg monthly and the 200 mg every 2 weeks groups had significantly higher enthesitis reduction than the placebo group. The effect size was moderate in both groups. The baseline degree of severity may have contributed to the level of change seen in the certolizumab arms.

6.1.5 Golimumab

There were five placebo-controlled studies with golimumab that revealed statistically significant results for improvement in enthesitis. In the first study, Kavanaugh et al. assigned PsA patients to golimumab 100 mg monthly, 50 mg monthly, and placebo [47, 72]. The MASES with 13 enthesitis sites and PsA-modified MASES (PsA-MASES), which included 15 enthesitis sites, incorporating the left and right plantar fascia, were used. Regardless of the enthesitis scale used for the same 405 patients, both the high- and low-dose golimumab groups had statistically significant median percentage changes from baseline of between 43.6 and 52.4% (p = 0.001). Of note, the relative difference in scores was reported, not the raw scores at baseline or 24 weeks. However, the prevalence of enthesitis in the treatment groups compared with placebo was lower at week 24. The omission of raw scores makes it difficult to gauge the severity of disease in the responders versus nonresponders; however, Orbai et al. cited a moderate effect size for both doses of golimumab using the PsA-MASES [60].

A subsequent phase III study in patients with PsA examined golimumab intravenously versus placebo over a blinded period of 14 weeks [73]; 70% of these patients were receiving concomitant methotrexate. Of note, the baseline prevalence of enthesitis based on LEI was 76.25% (higher than other clinical trials), and the mean baseline LEI score was moderately high. At week 14, there was a statistically greater mean LEI reduction in the golimumab group compared with the placebo group. Proportions of patients with resolution of enthesitis were not reported in this study.

van der Heijde et al. studied golimumab 50 and 100 mg subcutaneous doses compared with placebo in ankylosing spondylitis patients over a blinded period of 24 weeks [53]. This trial used the Berlin Index (12 sites), modified MASES (13 sites), and the UCSF Index (17 sites). Fifty percent of patients in the placebo group and 30% of patients in the golimumab 50 mg group dropped out. Only one of the six active comparisons, the golimumab 100 mg monthly dose assessed with the UCSF Index, achieved statistical significance; however, the calculated effect size was low. The low severity of enthesitis and loss of patients likely contributed to the lack of significance of improvements in enthesitis at 24 weeks when compared with placebo. The fourth study, by Carron et al., included 60 active early peripheral SpA patients and compared golimumab 50 mg monthly versus placebo using the modified MASES [48]. Sixteen of 40 patients in the golimumab group and 9 of 20 patients in the placebo group had enthesitis at baseline. At 24 weeks, 7 of 40 patients had enthesitis in at least one site in the golimumab group compared with 16 of 20 patients in the placebo arm, which was a statistically significant difference. Of note, the mean number of entheses affected was low, with two baseline entheseal sites in the placebo group and approximately one in the treatment group. In addition, the number of subjects with enthesitis was small. The MASES and modified MASES baseline and change values were not reported. Due to these limitations, it is difficult to generalize the results from this study. The fifth study, by Sieper et al., examined golimumab 50 mg or placebo subcutaneously in patients with nonradiographic axial SpA [74]. The MRI prevalence of sacroiliitis was 66.7%, HLA-B27 presence was 82.4%, and 42.8% were female. Baseline severity based on mean MASES was low. At week 16, there was a statistically significant improvement in the golimumab group, but the effect size was small.
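To make the week 24 proportions in the Carron et al. study concrete, the sketch below converts the reported counts (7 of 40 with golimumab, 16 of 20 with placebo) into proportions and applies a two-sided Fisher's exact test. The choice of test is an assumption made for illustration only; the text does not state which test the trial used, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Counts reported in the text at week 24 (with enthesitis, without enthesitis).
golimumab_with, golimumab_without = 7, 40 - 7
placebo_with, placebo_without = 16, 20 - 16

prop_golimumab = golimumab_with / 40   # proportion with enthesitis, golimumab
prop_placebo = placebo_with / 20       # proportion with enthesitis, placebo
risk_difference = prop_golimumab - prop_placebo

p_value = fisher_exact_two_sided(golimumab_with, golimumab_without,
                                 placebo_with, placebo_without)
```

Because only 25 patients had enthesitis at baseline, a comparison of this kind describes prevalence in the whole randomized population rather than resolution among those affected at baseline, which is one reason the text cautions against generalizing the result.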

Overall, the anti-TNF trials suggest a mild therapeutic effect when used for enthesitis in patients with ankylosing spondylitis, nonradiographic axial SpA, and peripheral SpA. In comparison, trials in PsA seemed to show efficacy and, where available, the effect size was moderate. One reason may be that the LEI, which was used most frequently, may be more responsive (see the Discussion section). In addition, the prevalence and severity of enthesitis were higher in the PsA studies, leading to a better chance of demonstrating the therapeutic effect. The other trials tended to have a low baseline prevalence of enthesitis, small populations, and, in one study, a large number of dropouts.

6.2 Anti-Interleukin (IL)-17 Therapies

6.2.1 Brodalumab

Drugs targeting the IL-17 axis include brodalumab, an IL-17 receptor antibody, and secukinumab and ixekizumab, which are antibodies to IL-17A. Brodalumab is currently US FDA-approved for psoriasis only, but it was studied in PsA by Mease et al. [75]. Two doses of brodalumab compared with placebo, over a 12-week blinded period, demonstrated articular response (ACR20) but did not show significant differences in LEI score between the treatment and placebo groups.

6.2.2 Secukinumab

Secukinumab has been studied in five PsA studies and five ankylosing spondylitis studies, but enthesitis data for ankylosing spondylitis were only available in the manuscript form of the phase II study. In the exploratory ankylosing spondylitis study, enthesitis frequency was too low to allow comparison between the treatment and placebo groups [76]. A post hoc analysis of the four phase III studies in ankylosing spondylitis was recently presented in abstract form [77]. Responses for the MASES, as well as its peripheral components, were reported (Fig. 5). The authors also attempted to report on axial sites, but these contained both lateral hip and elbow sites and therefore were not strictly axial. At week 16, the end of the placebo phase of the studies, the approved dose of secukinumab 150 mg every 4 weeks showed superiority over placebo for MASES. Of note, this is one of the few studies that has shown a response to MASES, which may be due to the increased power gained by pooling the data.

Fig. 5

Post hoc pooled analysis of four trials of secukinumab 150 mg (n = 355) versus placebo (n = 280) in ankylosing spondylitis subjects [77]. a Least squares mean changes in the MASES composite index, as well as in peripheral entheses (bilateral Achilles, lateral hip, and elbow entheses) and Achilles entheses. Only changes in the overall MASES Index were significant. b Percentage complete response for the MASES, as well as the peripheral sites and Achilles entheses. Statistically significant values are marked on both charts. MASES Maastricht Ankylosing Spondylitis Enthesitis Score, Enth entheses, LS least squares

The first phase II study of secukinumab compared 606 PsA patients randomized to placebo or intravenous secukinumab loading, followed by two dose groups of subcutaneous secukinumab [78]. The subjects had an overall 30% prior exposure to anti-TNF. The mean prevalence of enthesitis using a 4-point enthesitis index (bilateral lateral epicondyles and Achilles tendons) was 61.4%. The resolution of enthesitis was 47.5% in the pooled treatment arm, compared with 12.8% in the placebo arm. Although the data suggest a positive response, caution needs to be exercised since the content validity of this secondary outcome based on four entheseal sites may not be optimal. In the second study, PsA patients were randomized to subcutaneous placebo or three doses of secukinumab once weekly [79]. The mean prevalence of enthesitis was 63%, and 40% of subjects in the pooled treatment group experienced resolution of enthesitis at 24 weeks, compared with 21.5% in the placebo group. Subgroup analysis showed that the proportion of resolution of enthesitis was highest in the two highest doses of secukinumab. The overall mean LEI score was 3.1, indicating moderate severity of enthesitis. Mean differences in LEI did not undergo statistical inference testing, but the effect size calculated suggests a moderate effect for the 300 mg dose. The study stated that based on prespecified hierarchical analysis, improvements of enthesitis were not statistically significant. Although this study was exceptional in reporting the mean metrics of the LEI, it would have been instructive to report the data as box plots to understand if the proportion of patients that fully responded had low entheseal scores.

The third secukinumab study examined 414 PsA subjects allocated to subcutaneous secukinumab 300 or 150 mg after a five-dose loading regimen compared with placebo [80]. Overall, 32% were anti-TNF therapy-experienced. The enthesitis instrument was not specified in the study, but approximately two-thirds of the study population had enthesitis at baseline. Only the response in the 300 mg group was statistically significant. There were insufficient data on the instrument and the mean raw scores of the groups to be able to confidently interpret the response of enthesitis to secukinumab in this study. The fourth secukinumab study examined 341 PsA subjects who were randomized to placebo or two arms of subcutaneous secukinumab, one with a five-dose loading regimen and the other without [81]; 24% had prior anti-TNF exposure. The enthesitis instrument was not specified in the study, but approximately two-thirds of the population had enthesitis. Of note, only the no-load arm response showed statistical significance. It is curious that despite the numbers of patients and the even distribution of baseline characteristics, the loading group, which had the greater dose exposure to secukinumab, did not have as robust an entheseal response as the no-load arm. As in the earlier trial, the lack of detail regarding the enthesitis tool and the mean scores did not allow a clear understanding of the response.

Finally, the most recent study was the largest study of secukinumab, which enrolled 996 PsA patients [82]. Subjects were randomized to placebo or three groups using subcutaneous administration: secukinumab 300 mg with a loading dose, 150 mg with a loading dose, and secukinumab 150 mg without a loading dose; 29% had prior anti-TNF exposure. Approximately two-thirds of the population had enthesitis, and, in contrast to the preceding studies, the two secukinumab doses with loading regimens had a statistically significant resolution of enthesitis at 16 weeks. The results of this study are unusual in that the placebo group had the largest number of subjects, and although the percentage response appears low, the number of patients responding was large, and the prevalence of enthesitis decreased appreciably in the placebo group. The tendency to report proportions alone, bypassing the metrics of the outcome measure, does not allow readers to fully understand the nuances of the data or the validity and robustness of the response. Overall, there is a trend toward improvement of axial and peripheral enthesitis with the higher doses of secukinumab based on proportional improvements. The effect of loading doses was contradictory. We could only calculate the effect size for one study, in which the LEI showed a moderate effect size for the secukinumab 300 mg dose (Table 1).

6.2.3 Ixekizumab

Two phase III studies using ixekizumab evaluated enthesitis as secondary endpoints in PsA. The first study included adalimumab as an active comparator and is discussed in Sect. 7 as well as in the pooled analyses of the two trials by Gladman et al. [42]. Nash et al. evaluated 363 PsA patients with inadequate response to anti-TNF therapies [83]. The mean prevalence of enthesitis was 60.7% based on an LEI score of > 0. Subjects were allocated to subcutaneous placebo or a loading dose of ixekizumab 160 mg, followed by ixekizumab 80 mg every 4 weeks or every 2 weeks. At the end of the placebo-controlled arm at 24 weeks, there was no difference for the proportions of patients with enthesitis or least squares mean in the treatment group compared with placebo.

6.3 Anti-IL-12/23 Therapies

6.3.1 Ustekinumab

Two randomized, placebo-controlled studies of ustekinumab in PsA were identified. In the first trial, McInnes et al. studied 330 active PsA patients with inadequate response to conventional disease-modifying agents [20]. Subjects received two doses of subcutaneous ustekinumab or placebo. Based on a PsA-modified MASES score, the overall baseline prevalence of enthesitis was 71.7%. The enthesitis prevalence was likely higher than other trials as the PsA-modified MASES includes axial sites and, overall, has more entheseal sites. Also of note, 15.6% of subjects were receiving corticosteroids. At the end of the blinded period of 24 weeks, both dosing regimens of ustekinumab had a statistically significant reduction in the prevalence of enthesitis from baseline. Orbai et al. reported a small effect size for mean change in PsA-modified MASES based on data obtained from the sponsor (Table 1) [60]. A similar dosing regimen and placebo-controlled study was performed on 312 active PsA patients, of whom 58% were anti-TNF inadequate responders [19]. Based on a PsA-modified MASES score of > 0, the baseline prevalence of enthesitis was 70.8%. At the end of the 24-week placebo phase, both treatment groups had a statistically significant lower prevalence of enthesitis compared with placebo. However, the placebo response seems to be low or to have worsened, and the overall post-treatment ustekinumab prevalence of 72.9% signifies that the majority of the patients in this group still experienced enthesitis. A small effect size was reported by Orbai et al. for mean PsA-modified MASES at 24 weeks [60]. We excluded post hoc analysis of the above two trials by Kavanaugh et al. as only a subset of patients with spondylitis and peripheral arthritis was evaluated [84]. 
In contrast to the above studies, a recent small, open-label, observational study reported that amongst a group of 23 PsA patients with enthesitis at baseline, 82% of subjects had complete clearance of enthesitis, based on the MASES instrument, 24 weeks after administration of ustekinumab [85]. Although the duration of the ECLIPSA study was longer, most of the responses occurred in the first 12 weeks, with smaller changes in the second 3-month period. The baseline severity of enthesitis was low based on the median MASES score of 2 (out of 13). The BMI in the study was much lower, with a median BMI of 25.5. The study had a comparison arm of 24 subjects receiving various anti-TNF agents. Statistically significantly lower enthesitis clearing responses were reported in the anti-TNF arm. Care needs to be exercised in declaring the superiority of targeting IL-12/23 over anti-TNF agents since ustekinumab dosing has a loading phase, whereas most of the anti-TNF therapies were subcutaneous without a loading dose. The anti-TNF group received various treatments that may not all have similar pharmacodynamic properties. In addition, the higher proportion of males in the anti-TNF therapy group may have biased that group toward more severe subclinical enthesitis [86]. A more extensive, controlled, blinded study with uniform comparators may confirm or refute the relative efficacy of targeting IL-12/23 over anti-TNF for enthesitis.

6.4 Anti-IL-23 Therapies

6.4.1 Guselkumab

Guselkumab is a monoclonal antibody against p19, a subunit of IL-23, and is FDA-approved for moderate to severe plaque psoriasis. A phase II trial studied 149 active PsA subjects who were administered guselkumab 100 mg subcutaneously, or placebo, in a double-blind fashion [87, 88]. The baseline prevalence of enthesitis based on LEI was slightly higher in the guselkumab group compared with placebo (Table 1). At 24 weeks, a statistically significantly higher proportion of patients in the guselkumab group had resolution of enthesitis compared with the placebo group, i.e. 56.6% versus 28. In addition, there was a moderate effect size for LEI mean change. Two phase III studies examined guselkumab in PsA subjects. Deodhar et al. reported on a cohort of PsA patients with moderately severe enthesitis, of whom 30% were anti-TNF therapy-experienced (Table 1) [89]. Two dose regimens of guselkumab were compared with placebo. The proportional changes in enthesitis were not statistically significantly better in the guselkumab group. Data from secondary outcome measures were pooled and reported in the study using biologic-naïve patients [90]. The second study used a similar design, had biologic therapy-naïve PsA patients, moderate severity of enthesitis, and a baseline prevalence of enthesitis of approximately 70% (Table 1) [90]. In the pooled report, enthesitis resolution, as well as the least squares mean difference in LEI, was statistically significant for the two guselkumab groups compared with placebo [90]. These pooled group results suggest that guselkumab may be effective for peripheral enthesitis. Because of the contradictory results of the trials, a study with enthesitis as the primary outcome measure, coupled with an imaging measure, would help confirm this trend.

6.4.2 Risankizumab

Risankizumab is FDA-approved for moderate to severe plaque psoriasis, and targets the p19 subunit of IL-23. Phase II trial results reported enrollment of 173 PsA patients who were randomized to four arms of risankizumab and a placebo arm [91]. Overall, 64.7% of patients had enthesitis. At 16 weeks, the least squares mean change of the SPARCC enthesitis index from baseline was between − 1.4 and − 3.8 in the risankizumab arms, and − 1.2 in the placebo group. When two arms with the highest cumulative doses were combined, the least squares mean change was − 1.7, which was not statistically significantly different from placebo. Interestingly, risankizumab did not show efficacy for primary endpoints in an ankylosing spondylitis trial compared with placebo [22]. As discussed in Sect. 2.2, one hypothesis to explain the lack of effectiveness is that there may be an IL-23-independent mechanism driving inflammation at entheses.

6.4.3 Tildrakizumab

Tildrakizumab, an antibody against p19 of IL-23, is FDA-approved for moderate to severe plaque psoriasis [92]. At the time of writing, no publications addressing enthesitis were available.

6.5 Phosphodiesterase 4 Inhibitors

6.5.1 Apremilast

Apremilast is a phosphodiesterase 4 inhibitor that has been studied in five clinical trials. The first three trials, Psoriatic Arthritis Long-term Assessment of Clinical Efficacy (PALACE) 1 [93], 2 [94], and 3 [95], evaluated patients who may have had conventional DMARDs or biologic agents and were allowed concomitant conventional DMARDs. Analysis of data from a prespecified pooling of subjects with enthesitis from these three trials was reported by Gladman et al. [96]. In summary, patients with active PsA who had an inadequate response to fewer than four conventional or biologic DMARDs, or fewer than two anti-TNF agents, were recruited. In addition, PALACE 3 required one or more plaque psoriasis lesions ≥ 2 cm. Subjects were randomized to placebo or one of two doses of apremilast for a blinded period of 24 weeks; 63.3% of subjects had enthesitis based on a MASES score of > 0. The PALACE 1–4 studies, as well as the pooled analyses, are abstracted in Table 1. For the pooled analysis of the PALACE 1–3 trials, only the apremilast 30 mg mean MASES change was statistically significantly lower compared with placebo at week 24, but the effect size was small. There was no statistical difference in the proportion of subjects with resolved enthesitis in the three groups. The lower response compared with other agents may be due to the instrument used, which has more axial sites. However, the ustekinumab trial discussed in Sect. 6.3.1 in a similar population used the PsA-modified MASES and seemed to have a greater number of subjects with no enthesitis at 24 weeks. Since these are not head-to-head trials and are in different populations, care needs to be taken in interpreting these differences. In the PALACE 4 trial, Wells et al. evaluated 527 DMARD-naïve PsA subjects who were randomized to placebo or two doses of apremilast for a 24-week placebo period [97]. The MASES > 0 mean baseline prevalence of enthesitis was 65%. 
In patients who had baseline enthesitis, only the apremilast 30 mg dose showed a statistically significant reduction in the MASES score at 24 weeks, as well as in the proportion of subjects with resolved enthesitis. The fifth apremilast study by Nash et al. examined apremilast 30 mg twice daily, or placebo, among 219 PsA patients, for a blinded period of 16 weeks [98]. In contrast to the other apremilast studies, the Gladman Enthesitis Index (GEI 0–6), which consists of nonaxial entheses, was used (ESM Table 1) [99]. Only about half of the total group had baseline enthesitis, and, in these patients, the mean reduction in the GEI was statistically significant, but the proportion of patients who had resolved enthesitis was not. The effect size was large for mean change of GEI and, compared with the pooled apremilast studies using MASES, underscores the importance of the enthesitis instrument used. The evidence suggests that apremilast shows better efficacy for peripheral than for axial enthesitis.

6.6 Janus Kinase Inhibitors

6.6.1 Tofacitinib

Tofacitinib is a JAK inhibitor that inhibits JAK1/3, and partially inhibits JAK2. Two phase III trials in subjects with PsA and one phase II trial in subjects with ankylosing spondylitis had available enthesitis data. One of the PsA phase III studies was a comparative efficacy trial and is described in Sect. 7. Gladman et al. report on a trial of PsA patients with inadequate response to anti-TNF therapy [100]. In the first 12 weeks of the placebo phase, patients were randomized to tofacitinib 5 mg twice daily, 10 mg twice daily, or placebo. There was moderate prevalence and severity of baseline enthesitis. Both tofacitinib groups had statistically significant improvements in the least squares mean change of LEI, with moderate effect sizes, but only the tofacitinib 5 mg group had a statistically significant resolution of enthesitis based on LEI. LEI and SPARCC indices were used in the two PsA trials, but only LEI results were published in the Mease et al. [110] and Gladman et al. [100] studies. However, SPARCC data were available in the supplement accompanying the integrative analysis of the two PsA trials by Nash et al. [101]. The pooled analyses reported significant proportional changes for both tofacitinib doses for LEI < 1, but mean changes were not given. For SPARCC, only the tofacitinib 10 mg group achieved a statistically significant reduction in enthesitis. Briefly, in a phase II study in ankylosing spondylitis patients, statistically significant improvements were seen in the Berlin Index from baseline to 12 weeks with both the tofacitinib 5 and 10 mg doses, but not the tofacitinib 2 mg dose [102]. Based on these studies, it appears that tofacitinib has efficacy for enthesitis, with stronger evidence for the tofacitinib 10 mg twice daily dose. The FDA-approved dosing is limited to tofacitinib 5 mg twice daily, especially due to the recent black-box warning for thromboembolism with higher doses of tofacitinib [103].

6.6.2 Filgotinib

Filgotinib is a preferential JAK1 inhibitor. In a 16-week, phase II trial of filgotinib in PsA, patients who were inadequate responders to conventional therapies were randomized to filgotinib 200 mg daily or placebo; 51.5% of the filgotinib group achieved an LEI of zero, compared with 25.6% in the placebo group [104]. These results were reported as statistically significant.

No studies reporting data on the efficacy of upadacitinib or baricitinib for enthesitis were identified.

6.7 Therapies against T-Cell Co-Stimulation

6.7.1 Abatacept

Mease et al. reported on a clinical trial of 424 patients comparing abatacept 125 mg subcutaneous weekly with matched placebo [105]; 60% had prior exposure to anti-TNF therapies and 64% had baseline enthesitis based on the LEI. At the end of the placebo-controlled period, at 24 weeks, the difference in the proportion of patients with enthesitis resolution was not statistically significant. No other enthesitis metrics were reported. The authors noted that this trial had a proportionally higher number of subjects who were anti-TNF-experienced and that greater efficacy for joint responses than for skin responses was observed. The result may also mean that targeting T-cell co-stimulation is not as efficacious for enthesitis.

6.8 Anti-IL-6

6.8.1 Tocilizumab

Sieper et al. reported on two short-term, placebo-controlled studies of tocilizumab in ankylosing spondylitis patients [106]. Overall, both studies failed to show efficacy for ankylosing spondylitis endpoints or MASES from baseline to week 12.

6.8.2 Clazakizumab

Clazakizumab is a monoclonal antibody against IL-6. It is the only IL-6 antagonist that has been tested in PsA and enthesitis. In a 24-week, placebo-controlled, phase II study by Mease et al., 165 patients were randomized to receive placebo or one of three doses of clazakizumab every 4 weeks [107]. Patients could be treated with or without methotrexate but were biologic-naïve. The SPARCC and LEI indices were used to assess enthesitis; 69% of patients were receiving methotrexate. Overall, the study did not show a dose response in its primary outcome measure (the ACR20) and, although there were numeric differences, there were no statistically significant differences in the proportions of enthesitis compared with placebo at 24 weeks.

7 Controlled Efficacy Trials

With many therapeutic choices, comparative efficacy trials are essential to directly compare the efficacy, as well as adverse events, of the agents discussed in this review. An additional question addressed by one of the studies discussed in the following paragraphs is the role of methotrexate when added to biologic therapies in PsA. Four of the six comparative efficacy trials had a controlled, blinded phase. The open-label study comparing ustekinumab with various anti-TNF therapies was discussed in Sect. 6.3.1. The recent report on the comparison of ixekizumab with adalimumab was excluded due to its open-label design [108].

Mease et al. studied 417 anti-TNF therapy-naïve PsA patients with a baseline prevalence of enthesitis based on the LEI of 58% [109]. This double-dummy blinded trial compared subcutaneous placebo with adalimumab and subcutaneous ixekizumab 80 mg every 2 weeks or every 4 weeks, each preceded by a loading dose. Overall, approximately 55% of the population had enthesitis as well as a moderate baseline LEI score. After 24 weeks, there was a higher resolution of enthesitis in the two ixekizumab groups and the adalimumab group compared with placebo, and the change in the ixekizumab groups was statistically significant. The least squares LEI mean difference was statistically significant for the ixekizumab every 2 weeks group. The study was not powered to compare ixekizumab with adalimumab. The effect size could not be calculated since the standard error of the mean was not available for the associated least squares mean. Gladman et al. reported on a post hoc analysis of the combined data of subjects with enthesitis or dactylitis from the two phase III ixekizumab trials [42]. Unfortunately, the results were only expressed in proportions and not in the change of LEI. In addition to the overall pooled results from the preceding study, this manuscript also reported that adalimumab and both ixekizumab groups had more subjects with resolved enthesitis at all three LEI entheseal sites at 24 weeks compared with placebo (Fig. 6).
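The dependence of an effect size on a reported measure of spread can be made explicit. As a rough sketch, assuming the standard error of the mean change (SEM) were available for each arm and could be treated as the standard deviation divided by the square root of the arm size (an approximation that may not hold exactly for model-based least squares means), Cohen's d could be recovered as shown. The symbols are introduced here for illustration only and are not taken from the trial report; arm 1 is the active drug and arm 2 is placebo.

```latex
\begin{aligned}
\mathrm{SD}_i &= \mathrm{SEM}_i \,\sqrt{n_i}, \qquad i \in \{1, 2\} \\
s_p &= \sqrt{\frac{(n_1 - 1)\,\mathrm{SD}_1^{2} + (n_2 - 1)\,\mathrm{SD}_2^{2}}{n_1 + n_2 - 2}} \\
d &= \frac{\bar{\Delta}_1 - \bar{\Delta}_2}{s_p}
\end{aligned}
```

Here \bar{\Delta}_i is the mean change from baseline in arm i and s_p is the pooled standard deviation. Without SEM_i (or SD_i) for each arm, s_p, and therefore d, cannot be computed.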

Fig. 6

Enthesitis response (percentage of patients with LEI = 0, nonresponder imputation) by the anatomical site at 24 weeks in the combined data from two ixekizumab trials [25]. LEI Leeds Enthesitis Index, PBO placebo, ADA adalimumab, IXEQ4W ixekizumab every 4 weeks, IXEQ2W ixekizumab every 2 weeks

Active PsA patients with inadequate response to conventional biologic agents were studied in a 12-month, placebo-controlled trial with a double-blinded 3-month phase [110]. Subjects were assigned to tofacitinib 5 mg twice daily, tofacitinib 10 mg twice daily, adalimumab 40 mg subcutaneously every 2 weeks, or placebo. After 3 months, adalimumab was replaced by placebo. The overall population had a moderate prevalence and degree of enthesitis. At 16 weeks, only the tofacitinib 10 mg dose had a statistically significant mean reduction in enthesitis score compared with placebo, with a high effect size. The study was not powered to elicit differences between the adalimumab and tofacitinib groups. Although the adalimumab group mean change was reported as not statistically significant, the effect size of the mean change at 12 weeks was moderate (Table 1). It is possible that if the blinded period ran for 6 months, the adalimumab arm may have had a chance to maximize its therapeutic response. Of note, due to thromboembolic events, the FDA has added a black-box warning against the use of tofacitinib 10 mg twice daily dosing [103].

Methotrexate alone or in combination with etanercept was compared with etanercept monotherapy in a PsA double-blind study [111]. The premise behind this study was to examine whether methotrexate provided synergistic or additive effects to etanercept, as well as a direct comparison with the monotherapy arms. This study is notable in that in the methotrexate arms, the median dose of methotrexate was 20 mg once a week. After 24 weeks of the blinded period, a significant finding was the superiority of etanercept for ACR20, ACR50, and ACR70 outcomes. In contrast to trials in rheumatoid arthritis patients, adding methotrexate to etanercept did not increase the efficacy of the synovitis endpoints. Given the limitations of analyzing secondary endpoints, all three groups improved, with confidence intervals suggesting a significant change from baseline (Table 1). There were no differences in enthesitis reduction or resolution between the three groups. In contrast to synovitis, methotrexate monotherapy seemed to be as efficacious for enthesitis as the etanercept arms but did not show a synergistic effect in the combination arm.

McInnes et al. compared secukinumab with adalimumab monotherapy in biologic-naïve patients [112]. The secukinumab arm received a loading regimen over 5 weeks, then 300 mg subcutaneously every 4 weeks versus adalimumab administered every 2 weeks; 58% of patients had enthesitis based on LEI and 74% of patients had enthesitis based on SPARCC. At the end of the 52-week trial, neither the LEI nor the SPARCC comparison showed a statistically significant difference in enthesitis between secukinumab and adalimumab. Interestingly, this trial did not provide information about the mean LEI or SPARCC scores at baseline and did not give any mean change data for the enthesitis indices at the end of the trial. Without this information, it is difficult to interpret the enthesitis data offered in this trial.

8 Discussion

In Sects. 6 and 7, we have extensively discussed individual agents as well as classes of drugs and their impact on enthesitis. Comparative efficacy can only be inferred indirectly since these are not head-to-head trials. Any such judgment rests on how responsive the enthesitis instrument was in each study, and several factors affect responsiveness. Responsiveness of entheseal indices can be viewed in two ways: the degree to which the measurement changes, which is the most stringent, and the proportion of subjects who achieve a target such as no enthesitis. As shown in Fig. 7, when comparing the degree of change, the majority of trials that used the MASES Index did not have a significant effect size. The lackluster performance may, in part, be due to variability in how the measurements were acquired, as well as underpowered secondary analysis due to low numbers of subjects with enthesitis, especially in studies with a large number of dropouts. In addition, it may mean that at the sites chosen in the MASES Index, either the prevalence of enthesitis is low or the measurement is unreliable. When peripheral sites are included, as in the PsA-MASES Index, the resulting responsiveness is not consistently better (Fig. 7). On the other hand, the responsiveness of the LEI seems to be consistent and statistically significant in all studies that had a calculable effect size (Fig. 7). In their seminal publication delineating the LEI, Healy and Helliwell also demonstrated a better effect size for the LEI than for the MASES Index, but confidence intervals were not given [52]. Correlative sonographic studies have shown a moderate correlation with inflammatory imaging features of enthesitis and a weak relationship with ‘damage’ features such as enthesophyte formation [113]. In fact, Healy and Helliwell have postulated that tender points may not be solely due to entheseal inflammation but may also be due to adjacent articular inflammation [52]. Ibrahim et al. reported that they could not distinguish between rheumatoid arthritis and PsA patients using ultrasound of the LEI entheseal sites [114]. In summary, although the LEI seems to hit the sweet spot of feasibility and responsiveness, its construct validity, that is, whether it is actually measuring enthesitis, is not clear. There is a need to add imaging studies such as ultrasound and MRI, including whole-body MRI, to future SpA studies.

Fig. 7

Forest plot of Cohen’s d effect sizes with error bars indicating 95% confidence intervals calculated for studies where mean changes were available. The dotted line at zero indicates no effect. Comparisons are active drug versus placebo (see Table 1 for study details and citations). Effect size using a Cohen’s d statistic of < 0.2 is considered negligible, 0.2–0.5 is considered small, 0.5–0.8 is considered moderate, and > 0.8 is considered large. Confidence intervals assess the reliability of the estimate [52, 62]. PsA psoriatic arthritis, AS ankylosing spondylitis, NrAxSpA nonradiographic axial spondyloarthritis, pSpA peripheral spondyloarthritis, ADA adalimumab, ETA etanercept, CTZ certolizumab, GOL golimumab, SEC secukinumab, UST ustekinumab, GUS guselkumab, APR apremilast, TOF tofacitinib, LEI Leeds Enthesitis Index, MASES Maastricht Ankylosing Spondylitis Enthesitis Score, PsA Mod MASES psoriatic arthritis modified Maastricht Ankylosing Spondylitis Enthesitis Score
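To make the effect size categories in the Fig. 7 legend concrete, the following is a minimal Python sketch of how a Cohen's d, an approximate 95% confidence interval, and the corresponding label could be computed from per-arm mean changes. It is illustrative only: the function names and the input values are hypothetical, the large-sample standard error formula is a commonly used approximation and not necessarily the method used for Fig. 7, and the handling of values falling exactly on 0.5 or 0.8 is an assumption because the legend's ranges share those boundaries.

```python
import math


def cohens_d(mean_active, sd_active, n_active, mean_placebo, sd_placebo, n_placebo):
    """Cohen's d for the difference in mean change (active minus placebo).

    Uses the pooled standard deviation of the two arms. For an index where a
    lower score is better, a negative d favors the active arm.
    """
    pooled_var = (
        (n_active - 1) * sd_active ** 2 + (n_placebo - 1) * sd_placebo ** 2
    ) / (n_active + n_placebo - 2)
    return (mean_active - mean_placebo) / math.sqrt(pooled_var)


def d_confidence_interval(d, n_active, n_placebo, z=1.96):
    """Approximate 95% CI for d (large-sample standard error; an assumption)."""
    n_total = n_active + n_placebo
    se = math.sqrt(n_total / (n_active * n_placebo) + d ** 2 / (2 * n_total))
    return d - z * se, d + z * se


def label_effect(d):
    """Map |d| to the categories in the figure legend.

    Assumption: exactly 0.5 and exactly 0.8 are both labeled 'moderate'.
    """
    size = abs(d)
    if size < 0.2:
        return "negligible"
    if size < 0.5:
        return "small"
    if size <= 0.8:
        return "moderate"
    return "large"


# Hypothetical mean changes in an entheseal index (not taken from any trial).
example_d = cohens_d(-1.5, 2.0, 100, -0.5, 2.0, 100)
example_ci = d_confidence_interval(example_d, 100, 100)
example_label = label_effect(example_d)
print(example_d, example_ci, example_label)
```

With these hypothetical inputs, the interval excludes zero, which corresponds to an error bar that does not cross the dotted line in a forest plot of this kind.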

A large number of studies report proportions of subjects with resolved enthesitis. When an index with a smaller number of entheses is used, such as the LEI, the resolution of these entheses may not be representative of other entheses. In their study, Healy and Helliwell made a case that the LEI had the least floor effect when compared with the Mander Index; however, this was a clinical comparison, and imaging is needed to verify that it genuinely correlates with the resolution of enthesitis at multiple locations.

The severity of baseline enthesitis also contributes to responsiveness—the higher the severity, the better chance a potent agent has to show change. In the axial SpA studies, the severity based on relative entheseal scores seemed to be low compared with PsA subjects.

The comparison of various agents for efficacy in enthesitis is problematic in that not only are disparate populations recruited but there are also a variety of factors that affect mechanical enthesopathy, such as age, BMI, activity, and disease duration. None of the studies analyzed the results by BMI. Imaging studies have shown that biomechanical confounders such as weight and physical activity can give rise to entheseal changes indistinguishable from inflammatory enthesitis. Furthermore, some therapeutic agents are administered based on weight, while others have a loading schedule at the beginning of the study, hence biasing short-term results at 24 weeks. In addition, enthesitis is a secondary outcome measure in the majority of studies, and the analysis is conducted in a reduced population of subjects with baseline evidence of enthesitis, hence reducing power due to reduced numbers, which is further compounded by multiple statistical testing. To further complicate matters, a variety of instruments are used, and reporting of results is not comprehensive.

In general, evidence from this descriptive review, summarized in Table 1, suggests that anti-IL-6 and T-cell co-stimulation targeting may not be efficacious for enthesitis. Anti-IL-23 targeting had contradictory results. Risankizumab was not efficacious for the treatment of axial or peripheral enthesopathy, but guselkumab showed a response in PsA enthesitis. It will be interesting to see the evolving evidence for other anti-IL-23 agents. If a consistent comparator is used, such as the LEI, then based on the effect size and confidence intervals, anti-TNF agents, anti-IL-17 agents, and JAK inhibitors have shown moderate efficacy. For other agents, the use of MASES may have resulted in the underpowering of their studies, and hence failure to show effect. One recommendation by the authors is that future studies pair a clinical index that examines axial entheses with one that examines peripheral entheses.

One of the major findings of this review is the inconsistent reporting of results, which hampers a clear understanding of the data as well as a comparison between agents. Many studies report results as proportions of patients with enthesitis and do not report the mean changes in the entheseal instrument used. The entheseal instruments were not designed to report proportions of change. Patients with resolved enthesitis may have a milder disease or fewer sites of involvement. In order to judge durable response, not only should the mean change of the entheseal index be reported but also the proportional prevalence of enthesitis at baseline and end of the placebo-controlled period based on an entheseal score > 0. The effect size of the mean change would be helpful if consistently reported. To further understand what groups of patients respond, the analysis should include quartiles of degrees of baseline enthesitis and subsequent response, which may allow the reader to infer the severity of the group and to understand if the group with resolution of enthesitis had a milder disease. Finally, the majority of studies reported on patients who had baseline enthesitis. Only one group analyzed and reported new-onset enthesitis when the entheseal site was negative at baseline [55]. Subgroup analysis of only patients with baseline enthesitis introduces a bias of excluding subjects who may develop enthesitis during the study period. If proportions are to be reported, they should be reported as the whole group at baseline and end of the placebo period. Overall, clinical instruments evaluating enthesitis may not be perfect since they only record tenderness at the site, which may be influenced by pain sensitization or inflammation of adjacent articular structures. MRI or ultrasound imaging to evaluate enthesitis would help assess both inflammatory changes and chronic changes regarded as damage.
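
As an illustration only, the reporting elements proposed above could be computed from per-subject entheseal scores as in the following minimal sketch. The scores are invented for illustration and are not trial data. The function names `summarize` and `change_by_baseline_quartile` are hypothetical, and the standardized response mean (mean change divided by the SD of change) is used as one possible effect size, which is an assumption rather than the measure used in the reviewed studies.

```python
from statistics import mean, stdev

# Hypothetical LEI-style scores (0-6) for ten subjects in one arm.
# Illustrative values only; these are NOT data from any trial.
baseline = [0, 0, 1, 1, 2, 2, 3, 4, 5, 6]
week24 = [0, 1, 0, 0, 1, 2, 1, 2, 3, 4]


def summarize(baseline, followup):
    """Whole-group summary: mean change, effect size, and prevalence."""
    pairs = list(zip(baseline, followup))
    changes = [f - b for b, f in pairs]
    n = len(pairs)
    return {
        "mean_change": mean(changes),
        # Standardized response mean (assumed choice of effect size).
        "srm": mean(changes) / stdev(changes),
        # Proportion with an entheseal score > 0 in the whole group.
        "prevalence_baseline": sum(b > 0 for b, _ in pairs) / n,
        "prevalence_followup": sum(f > 0 for _, f in pairs) / n,
        # Subjects negative at baseline who develop enthesitis.
        "new_onset": sum(b == 0 and f > 0 for b, f in pairs),
        # Subjects positive at baseline whose score falls to 0.
        "resolved": sum(b > 0 and f == 0 for b, f in pairs),
    }


def change_by_baseline_quartile(baseline, followup):
    """Mean change within rank-based quartiles of baseline severity.

    Only subjects with baseline score > 0 are ranked; ties keep their
    input order, so the split is approximate for tied scores.
    """
    affected = sorted(
        ((b, f) for b, f in zip(baseline, followup) if b > 0),
        key=lambda pair: pair[0],
    )
    n = len(affected)
    result = []
    for q in range(4):
        chunk = affected[(q * n) // 4:((q + 1) * n) // 4]
        result.append(mean([f - b for b, f in chunk]) if chunk else None)
    return result


report = summarize(baseline, week24)
by_quartile = change_by_baseline_quartile(baseline, week24)
print(report)
print(by_quartile)
```

Reporting the whole-group prevalence alongside the mean change keeps new-onset cases visible, and the quartile breakdown indicates whether resolution is concentrated among subjects with milder baseline involvement.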

9 Conclusions

Enthesitis is a key pathological manifestation of SpA. It is associated with increased morbidity and, in diseases such as PsA, is linked to a higher prevalence of erosive disease. The clinical tools available to examine enthesitis vary, not only in the number of entheses chosen but also in their axial or peripheral distribution. In studies of the modern targeted therapies for spondyloarthropathies, enthesitis is a secondary measure, and reporting of results in the majority of the studies is incomplete. When the LEI is used, anti-TNF agents, anti-IL-17 agents, and JAK inhibitors show a moderate effect size. The data for IL-23 targeting are contradictory. Other agents may not necessarily be inefficacious, since the choice of instrument may have hampered responsiveness. Future studies should ideally examine enthesitis as a primary outcome, use both axial and peripheral entheseal indices, and be coupled with an imaging measure to understand which components of the entheseal structure respond and whether that response aligns with symptom relief. Imaging studies may also help assess damage to the enthesis, as well as its correlation with function and clinical findings. With an increasing armamentarium, it is important to clarify whether enthesitis responds and to what degree. Transparency in reporting results will greatly help in this regard.