Introduction

Psoriatic arthritis (PsA) is a chronic inflammatory joint disease that can involve the peripheral joints and the axial skeleton, with enthesitis and dactylitis [1], and that can impose a heavy burden on patients [2]. In PsA, radiographic joint damage is characterized by a combination of changes, including erosions, fluffy periostitis, pencil-in-cup deformities, acro-osteolysis, and ankylosis [3, 4]. Involvement of the distal interphalangeal joints (DIPs) of the hands is a typical feature [5].

The assessment of radiographic damage is still one of the fundamental outcome measures in inflammatory arthritides and is a worldwide accepted measure of articular damage in PsA [6,7,8]. Several scoring systems originally developed for rheumatoid arthritis (RA) have subsequently been modified for PsA. These instruments include the modified Sharp/van der Heijde Score (mSvdHS) and the modified Steinbrocker global scoring method [9, 10], while the Psoriatic Arthritis Ratingen Score (PARS) is the only scoring system developed specifically for PsA [11].

All of these radiographic scoring systems are based on semiquantitative assessment, and their common drawback is the long time required to complete them. According to the Group for Research and Assessment of Psoriasis and Psoriatic Arthritis (GRAPPA), the mSvdHS is the optimal tool for randomized controlled trials, but the most appropriate tool for longitudinal observational studies is yet to be determined [12].

The PARS is the only scoring method that focuses on bony proliferation (BP). Proliferative lesions are pathognomonic for PsA and therefore are considered the most specific PsA radiographic features [13].

In view of these issues, this study was carried out to develop and provide the initial validation of a new, feasible radiographic score, called the Simplified Psoriatic Arthritis Radiographic Score (SPARS).

Materials and methods

Design and study population

From June 2016 to May 2018, posteroanterior radiographs of the hands and feet were collected from consecutive adult PsA patients (fulfilling the Classification Criteria for Psoriatic Arthritis) [13] attending the outpatient clinic of a tertiary rheumatologic center. For the purposes of this study, only patients with predominant peripheral joint involvement (defined by the presence of at least one tender and/or swollen joint of the hands or feet) were enrolled; subjects with exclusive psoriatic spondylitis were excluded.

In this cross-sectional study, patients underwent clinical, laboratory, and radiographic assessment. In detail, the clinical and laboratory assessment included the tender joint count (TJC, 0–68 joints), the swollen joint count (SJC, 0–66 joints), the C-reactive protein (CRP) value (in mg/dl), the patient global assessment (PGA) of disease activity, and the numerical rating scale (NRS) of pain (both on a 0–10 scale). These parameters were used to compute the Disease Activity in PSoriatic Arthritis (DAPSA) score. Patients were also tested for rheumatoid factor (RF) and anti-citrullinated protein antibodies (ACPA).
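As a brief illustration of how the five parameters listed above combine, the sketch below computes a DAPSA value as their simple sum, which is the usual definition of the index; the source itself does not spell out the formula. The function name, argument names, and range checks are illustrative assumptions, not part of the study protocol.

```python
def dapsa(tjc68: int, sjc66: int, pga: float, pain: float, crp_mg_dl: float) -> float:
    """Return the DAPSA score as the sum of its five components.

    Assumption: the standard additive definition is used
    (TJC 0-68 + SJC 0-66 + PGA 0-10 + pain NRS 0-10 + CRP in mg/dl).
    """
    if not 0 <= tjc68 <= 68:
        raise ValueError("tender joint count must be within 0-68")
    if not 0 <= sjc66 <= 66:
        raise ValueError("swollen joint count must be within 0-66")
    if not 0 <= pga <= 10 or not 0 <= pain <= 10:
        raise ValueError("PGA and pain NRS must be within 0-10")
    if crp_mg_dl < 0:
        raise ValueError("CRP cannot be negative")
    return tjc68 + sjc66 + pga + pain + crp_mg_dl


# Hypothetical patient: the PGA and pain values are invented for illustration.
example_score = dapsa(tjc68=8, sjc66=6, pga=5, pain=6, crp_mg_dl=0.7)
```

Because CRP enters the sum in mg/dl, values reported in mg/l would need to be divided by 10 before use.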

A CD-ROM containing the radiographs of each patient was collected within the 4 weeks following the clinical evaluation and stored in a blinded manner.

The preliminary validation of the SPARS proceeded in three stages: analysis of the reliability of the scoring system, determination of convergent construct validity against other traditional scoring systems, and evaluation of feasibility by assessing the average time needed to score with each scoring system.

Reading strategy and radiographic scoring systems

For each patient, digitized radiographic images were stored in a blinded manner. Two independent readers, a radiologist (MC) and a rheumatologist (FS), both with 20 years of experience in radiological scoring systems for inflammatory arthritides and blinded to the clinical features of the patients, analyzed the images. Before the study, the readers completed a training session in which they scored 20 radiographs (not included in the study) with each scoring system. Radiographs were scored independently according to the mSvdHS [9], the PARS [11], and the SPARS. The scoring systems are briefly described below and summarized in Table 1.

Table 1 Overview of the three radiographic scoring systems compared in this study

To determine reliability, all sets of hand and foot radiographs were scored by both readers using the three scoring systems in random order. To evaluate SPARS intra-rater reliability, the images were rescored 2 weeks after the first assessment. Feasibility was estimated using the average time needed to score the SPARS compared with the other two scoring systems.

Modified Sharp-van der Heijde score (mSvdHS)

The mSvdHS is based on the Sharp–van der Heijde method. The original scoring system evaluates erosions and joint space narrowing (JSN) of the joints of the hands and feet in RA [14, 15]. The method proposed for PsA evaluates erosions, JSN, subluxation, ankylosis, gross osteolysis, and pencil-in-cup lesions [16,17,18,19]. Erosions are assessed in 20 joints of hands and wrists: ten DIPs/interphalangeal joints of the thumbs (IPs), ten metacarpophalangeal joints (MCPs), two first metacarpal bones, two radial and ulnar bones, two multangular units (trapezium and trapezoid combined) and in 12 joints of the feet (ten metatarsophalangeal joints (MTPs) and two IPs of the big toes). JSN, subluxation, ankylosis, gross osteolysis, and pencil-in-cup lesions are assessed in the hands in ten DIPs/IPs, ten MCPs, second, third, fourth, and fifth carpometacarpal joints, two multangular units, two capitate-navicular-lunate joints, and two radiocarpal joints, and in the feet in ten MTPs and two IPs of the big toes. The maximum score for erosions is 5 in the joints of the hands and 10 in the joints of the feet. Scores for erosions are as follows: 0 = no erosions; 1 = discrete erosions; 2 = large erosions not passing the midline; 3 = large erosions passing the midline. A combination of the above scores leads to a maximum of 5 for a whole joint in the hands, and 5 at each side of the joint (for the entire joint a maximum of 10) in the feet. The JSN scoring is: 0 = normal; 1 = asymmetrical or minimal narrowing up to a maximum of 25%; 2 = definite narrowing with loss of up to 50% of the normal space; 3 = definite narrowing with loss of 50–99% of the normal space or subluxation; 4 = absence of a joint space, presumptive evidence of ankylosis, or complete luxation. Gross osteolysis and pencil-in-cup lesions are scored separately. If present, these lesions are scored with the maximum score for both erosions and JSN.
The maximum possible score for erosions is 200 for the hands and 120 for the feet; the maximum possible score for JSN is 160 for the hands and 48 for the feet. Finally, the maximum possible score is 528.
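The arithmetic behind these maxima can be checked with a short sketch. The per-joint maxima (5 and 10 for erosions, 4 for JSN) and the 12 foot joints come from the text; the figure of 40 hand joints per component is not stated explicitly and is inferred here from the reported subtotals (200 / 5 and 160 / 4), so it should be read as an assumption. All names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """One mSvdHS sub-score: a per-joint maximum applied to a set of joints."""
    name: str
    max_per_joint: int
    n_joints: int

    @property
    def maximum(self) -> int:
        return self.max_per_joint * self.n_joints


# Hand joint counts (40) are inferred from the subtotals reported in the text.
COMPONENTS = [
    Component("erosions, hands", max_per_joint=5, n_joints=40),
    Component("erosions, feet", max_per_joint=10, n_joints=12),
    Component("JSN, hands", max_per_joint=4, n_joints=40),
    Component("JSN, feet", max_per_joint=4, n_joints=12),
]

MSVDHS_MAX_TOTAL = sum(c.maximum for c in COMPONENTS)

for c in COMPONENTS:
    print(f"{c.name}: {c.maximum}")
print(f"total: {MSVDHS_MAX_TOTAL}")  # 200 + 120 + 160 + 48 = 528
```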

Psoriatic Arthritis Ratingen Score (PARS)

The PARS is the only method specifically developed for PsA [11]. Joints of the hands and feet are scored for erosions and BP. A total of 40 joints are scored: 8 DIPs of both hands, 8 proximal interphalangeal joints (PIPs) of both hands, 2 IPs of the thumbs, 10 MCPs, both wrists, 8 MTPs (from II to V) on both sides, and 2 IPs of the first toes. The PIPs and DIPs of the feet, although frequently affected in PsA, are not included because in many cases they are poorly visible and poorly reproducible at different time points. The PARS includes a destruction score (DS) and a BP score. In the DS, grading on a 0–5 scale is based on the amount of joint surface destruction: 0 = normal; 1 = one or more erosions with an interruption of the cortical plate of > 1 mm and destruction of up to 10% of the total joint surface; 2 = 11–25%; 3 = 26–50%; 4 = 51–75%; 5 = > 75% joint surface destruction. The BP score sums the lesions indicative of the osteoproliferation typical of PsA (para-articular spikes, supracortical bone formation, diaphyseal thickening, enlargement of the bone compared to the opposite side or to the radiographs). The grading is 0–4: 0 = normal; 1 = BP of 1–2 mm or bone growth < 25% of the original size (diameter); 2 = BP of 2–3 mm or bone growth of 25–50%; 3 = BP > 3 mm or bone growth > 50%; 4 = bony ankylosis. The DS (0–200) and the BP score (0–160) are summed to give the total score (0–360).
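The joint tally and the destruction grading described above can be sketched as follows. This is only an illustration of the thresholds quoted in the text; the function name, the boolean flag for a qualifying erosion (cortical interruption > 1 mm), and the dictionary keys are assumptions introduced for the example.

```python
from bisect import bisect_left

# Joint sets scored by the PARS, as listed in the text.
PARS_JOINTS = {
    "DIP hands": 8,
    "PIP hands": 8,
    "IP thumbs": 2,
    "MCP": 10,
    "wrists": 2,
    "MTP II-V": 8,
    "IP first toes": 2,
}
N_PARS_JOINTS = sum(PARS_JOINTS.values())

# Upper bounds (in % of joint surface destroyed) of DS grades 1 to 4.
_DS_UPPER_BOUNDS = [10, 25, 50, 75]


def destruction_grade(has_qualifying_erosion: bool, percent_destroyed: float) -> int:
    """Map joint surface destruction to the 0-5 destruction score (DS) grade."""
    if not has_qualifying_erosion:
        return 0
    if not 0 <= percent_destroyed <= 100:
        raise ValueError("percent_destroyed must be within 0-100")
    # bisect_left counts how many upper bounds lie strictly below the value,
    # so 10% stays in grade 1 while 11% moves to grade 2, and so on.
    return bisect_left(_DS_UPPER_BOUNDS, percent_destroyed) + 1


DS_MAX = N_PARS_JOINTS * 5      # destruction score range 0-200
BP_MAX = N_PARS_JOINTS * 4      # bony proliferation score range 0-160
PARS_MAX = DS_MAX + BP_MAX      # total score range 0-360
```

The BP grading is not coded here because its millimetre bands share boundary values (1–2 mm and 2–3 mm), so any cut-off chosen at exactly 2 mm would be a guess.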

Simplified Psoriatic Arthritis Radiographic Score (SPARS)

The SPARS definition was obtained through a consensus process involving three radiologists (MC, LC, and AG) skilled in musculoskeletal imaging and four rheumatologists (FS, EDD, MDC, and MML) with clinical experience in PsA and radiographic scoring systems. The SPARS assesses the same joints as the PARS in a simpler way: the PARS grading of erosions and BP is replaced by a count of the joints with erosions and of the joints with BP (Figs. 1 and 2). Similar simplifications have already been applied to radiographic scoring systems in the rheumatologic literature [20]. In the SPARS, a joint is defined as eroded (score 1) if one or more erosions with an interruption of the cortical plate > 1 mm (PARS DS grade 1) can be observed. JSN is present (score 1) if at least asymmetrical or minimal narrowing is detectable (mSvdHS grade 1). BP is present (score 1) if a proliferation of 1–2 mm or a bone growth < 25% of the diameter (PARS BP grade 1) is detectable. The score assigned to each joint therefore ranges from 0 (no structural damage) to 3 (coexisting erosions, JSN, and BP). This scoring is applied to the 40 joints of the PARS, so the maximum total SPARS score is 120.
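The counting rule described above can be written as a minimal sketch: each joint contributes one point for each of the three lesion types present, and the total is the sum over the 40 joints. The class and function names are illustrative assumptions; the presence or absence of each lesion is taken as already decided by the reader according to the thresholds in the text.

```python
from dataclasses import dataclass
from typing import Iterable

N_SPARS_JOINTS = 40  # same joint set as the PARS


@dataclass(frozen=True)
class JointReading:
    """Presence (True) or absence (False) of each lesion type in one joint."""
    erosion: bool = False
    jsn: bool = False
    bp: bool = False

    @property
    def score(self) -> int:
        # One point per lesion type present: 0 (no damage) to 3 (all three).
        return int(self.erosion) + int(self.jsn) + int(self.bp)


def spars_total(readings: Iterable[JointReading]) -> int:
    """Sum the per-joint scores over at most 40 joints (range 0-120)."""
    readings = list(readings)
    if len(readings) > N_SPARS_JOINTS:
        raise ValueError("SPARS is defined on 40 joints")
    return sum(r.score for r in readings)


SPARS_MAX = spars_total([JointReading(True, True, True)] * N_SPARS_JOINTS)

# Hypothetical reading of three joints, invented for illustration.
example_total = spars_total([
    JointReading(erosion=True, jsn=True),   # 2 points
    JointReading(bp=True),                  # 1 point
    JointReading(),                         # 0 points
])
```

Because no lesion is graded, the reader only makes up to three yes/no decisions per joint, which is consistent with the shorter scoring time the study reports for the SPARS.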

Fig. 1 Joints included in the Simplified Psoriatic Arthritis Radiographic Score (SPARS)

Fig. 2 Example of scoring in a 63-year-old male with moderate changes in the hands (a) and feet (b). For clarity, only the left side is depicted. Some examples of erosion (E), joint space narrowing (N), and osteoproliferation (P) are shown. In this case, the SPARS score for the left hand plus the left foot is 30

Statistical analysis

Demographic data were analyzed using descriptive statistics. Means with standard deviations (SDs) and medians with interquartile ranges were used to describe the data.

Measurement error was estimated by inter-rater reliability in 93 radiographs and by rescoring 50 radiographs to evaluate intra-rater reliability. As recommended, differences are reported both as intraclass correlation coefficients (ICCs) and visually, by plotting the difference in scores against the mean score of both raters, for determination of the smallest detectable difference (SDD) [21, 22]. The ICC is considered excellent if above 0.75, fair to good if between 0.4 and 0.75, and poor if below 0.4. The SDD is a statistical measure of measurement error based on the 95% limits of agreement, as described by Bland and Altman [23]. The SDD is reader- and sample-specific and represents the smallest change in score that can be discriminated from the measurement error of the scoring method. Using the SDD as the threshold for a definite change in score ensures that the changes observed are not due to reading variability. Convergent construct validity was investigated by correlation (using Pearson’s correlation test). Patients were divided into three groups according to disease duration: group 1, < 5 years; group 2, ≥ 5 and < 10 years; and group 3, ≥ 10 years. Comparisons between groups were analyzed with Student’s t test (gender) and with one-way analysis of variance (ANOVA) (disease duration). P values less than 0.05 were considered significant.
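The cut-offs and the Bland–Altman quantities described above can be illustrated with a small sketch using only the Python standard library. The function names and the toy reader scores are invented for the example. The half-width of the 95% limits of agreement (1.96 × SD of the between-reader differences) is shown as one common way of expressing an SDD; the exact SDD formula used in the study is not stated in the text, so this is an assumption.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def icc_category(icc: float) -> str:
    """Classify an ICC with the cut-offs given in the text."""
    if icc > 0.75:
        return "excellent"
    if icc >= 0.4:
        return "fair to good"
    return "poor"


def duration_group(years: float) -> int:
    """Assign the disease-duration group (1: <5, 2: >=5 and <10, 3: >=10 years)."""
    if years < 5:
        return 1
    if years < 10:
        return 2
    return 3


def limits_of_agreement(
    reader_a: Sequence[float], reader_b: Sequence[float]
) -> Tuple[float, float, float]:
    """Return (bias, lower limit, upper limit) of the 95% limits of agreement."""
    if len(reader_a) != len(reader_b) or len(reader_a) < 2:
        raise ValueError("need two equally long series with at least 2 scores")
    diffs = [a - b for a, b in zip(reader_a, reader_b)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)  # sample SD of the differences
    return bias, bias - half_width, bias + half_width


# Toy scores for four radiographs, invented for illustration only.
scores_a = [10, 20, 30, 40]
scores_b = [12, 18, 33, 39]
bias, lower, upper = limits_of_agreement(scores_a, scores_b)
sdd_like = (upper - lower) / 2  # half-width of the limits of agreement
```

Applied to the reliability figures reported later in the paper, `icc_category(0.884)` would fall in the "excellent" band.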

Data were processed with the MedCalc Statistical Software, version 18.0 (Ostend, Belgium), for Windows XP.

Results

Of the 140 consecutive patients with inflammatory involvement of the hands and feet, the complete (radiological and clinical) evaluation was available in 105 (75%) patients (71 women and 34 men). The mean (± SD) age of patients was 50.2 (± 12.1) years. The mean (± SD) disease duration was 10.1 (± 8.4) years. Median clinical and laboratory parameters of disease activity were: SJC 6 (range, 0–11), TJC 8 (range, 0–31), DAPSA score 26.8 (± 17.3), CRP 0.7 mg/dl (range, 0.1–8.7). No patient was RF or ACPA positive.

The mean (± SD) scores for the SPARS, mSvdHS, and PARS were 50.69 (± 24.40), 246.89 (± 114.31), and 144.73 (± 71.84), respectively. The median values (95% CIs for the median) were 51.50 (43.48 to 57.00), 245.00 (217.90 to 275.01), and 156.00 (133.90 to 167.01), respectively (Table 2). Figure 3a–c depicts the estimates of central tendency and distribution of the three scoring systems. All the methods showed a normal distribution.

Table 2 Descriptive statistics for SPARS, mSvdHS, and PARS in the whole cohort (105 patients)

Fig. 3 Histograms showing the score distribution and central tendency of the three scoring methods. Score distribution and central tendency of the Simplified Psoriatic Arthritis Radiographic Score (SPARS) (a), modified Sharp-van der Heijde Score (mSvdHS) (b), and Psoriatic Arthritis Ratingen Score (PARS) (c). The bar on the left of each graph represents the number of subjects with a score of 0 (floor effect); the bar on the right represents the number of subjects with the maximum possible score (ceiling effect)

The SPARS intra-rater reliability was excellent for both readers (ICCs 0.945 and 0.976). Inter-rater reliability was highest for SPARS (ICC = 0.884, 95% CI 0.852 to 0.898), followed by PARS (ICC = 0.869, 95% CI 0.842 to 0.889) and mSvdHS (ICC = 0.819, 95% CI 0.802 to 0.838). The SDD for the average of the SPARS scores by the two readers was 8.0. SPARS inter-rater reliability is shown in Fig. 4.

Fig. 4 Bland–Altman plot of the difference in scores against the mean radiographic score. Inter-rater Simplified Psoriatic Arthritis Radiographic Score (SPARS) results for the 105 patients. Solid horizontal lines represent the mean difference; broken horizontal lines represent 2 standard deviations (SDs) of difference from the mean

Regarding the convergent validity, SPARS strongly correlated with mSvdHS (r = 0.926, p < 0.0001) (Fig. 5a) and PARS (r = 0.904, p < 0.0001) (Fig. 5b).

Fig. 5 Scatter plots with regression line illustrating the correlation between the Simplified Psoriatic Arthritis Radiographic Score (SPARS) and the modified Sharp-van der Heijde Score (mSvdHS) (a) and the Psoriatic Arthritis Ratingen Score (PARS) (b)

When patients were grouped by disease duration, SPARS scores were higher in long-standing PsA (p < 0.001) (Fig. 6), while gender did not explain a significant proportion of the variance (p = 0.34).

Fig. 6 Histograms showing the variance of the Simplified Psoriatic Arthritis Radiographic Score (SPARS) according to disease duration

Feasibility was evaluated by measuring the mean time needed to score the radiographs of the hands and feet of each patient with each of the three scoring systems. The quickest scoring system was SPARS (4.5 min; range, 3.2 to 6.9 min), followed by PARS (10.1 min; range, 8.6 to 12.4 min) and mSvdHS (14.4 min; range, 11.3 to 17.8 min).

Discussion

The Steinbrocker global scoring method was modified for PsA through the inclusion of the DIPs [10]. It assesses global changes and gives an overall measure of joint damage from 0 to 4. The severity of radiological involvement is scored by assessing the degree of soft-tissue swelling, osteopenia, JSN, malalignment, and bony ankylosis. It is quick to perform; however, it has been used only in case-control studies [24,25,26,27].

The modified Sharp score (MSS) evaluates the same joints as the original scoring system, with the addition of DIPs 2 to 5 of both hands. The computation is quite complex: for the erosion score, the reader has to combine the original instructions for grades 0 to 5 of the Sharp score (counting the number of discrete erosions) with the definitions of the Ratingen score for RA, in which every 20% of joint surface destruction increases the grade by one [28].

The mSvdHS uses the same joints and definitions as in RA, with the addition of the DIP joints of the hands. The calculation of this score is also highly complex.

A high degree of complexity, together with reduced availability (but with the benefit of showing active inflammatory lesions), also characterizes the scoring systems based on magnetic resonance imaging, such as the Psoriatic Arthritis Magnetic Resonance Imaging Scoring System (PsAMRIS) [29].

The PARS was developed from the Rau and Herborn modification of the Larsen Score [30]. This scoring system considers the DIP joints of the hands. Altogether, it includes 40 joints, each of which is scored separately for BP and erosions.

In 2014, Tillett and colleagues investigated the feasibility, reliability, and sensitivity to change of four radiographic scoring systems for PsA (Steinbrocker, MSS, mSvdHS, and PARS) [31]. First, they demonstrated that the mSvdHS is the most reliable and sensitive-to-change scoring system. Secondly, the Steinbrocker method is the most feasible tool but lacks the sensitivity of the mSvdHS (the soft-tissue swelling element of the Steinbrocker method is a relevant source of variability). Thirdly, the smallest detectable change of the PARS is similar to that of the mSvdHS and MSS, but the PARS is faster to score.

Although the mSvdHS appears to be the most suitable tool in terms of sensitivity, it does not include BP, which is a very common, specific, and sensitive-to-change feature in PsA [12, 32,33,34]. Findings from a Swedish cohort demonstrated that BP contributes more than erosions to the changes observed over a period of 5 years [35]. The PARS is the only scoring method that focuses on osteoproliferation. On the other hand, the PARS does not include JSN, which seems to be more important than erosions for the preservation of function. In this respect, Kerschbaumer and colleagues analyzed 363 patients enrolled in the GO-REVEAL study and obtained the mSvdHS from radiographs performed at baseline and after 24, 52, and 104 weeks [36]. In line with previous reports, they observed a significant association of disability with joint damage [37, 38]. Importantly, as in RA, JSN is a surrogate of cartilage damage that is more strongly associated with functional impairment than erosions [39].

Tillett and colleagues recently proposed a novel radiographic scoring system, the Reductive X-ray Score for Psoriatic Arthritis (ReXSPA) [40]. The ReXSPA, built through a reductive analysis of existing composite scores, requires the assessment of 22 joints of the hands and feet, evaluating erosions, JSN, and BP. This composite score has a sensitivity similar to that of the mSvdHS, the most sensitive method developed, but is quicker to perform than the modified Steinbrocker, the most feasible method.

In the present study, we propose a novel scoring system that includes the hallmarks of PsA. SPARS encompasses erosions, JSN, and BP in the same joints as the PARS, but without grading the lesions, making it quicker to perform and easier to learn. Moreover, compared to ReXSPA, the number of joints included is larger, reducing the risk of underestimating the articular damage.

SPARS showed good agreement between assessors. The inter- and intra-rater reliability estimates are comparable to those of other scoring systems [13]. The SDD for the average of the SPARS scores by the two readers was 8.0. This value is close to that of the Steinbrocker method, but lower than those of the mSvdHS and MSS. The SDD of a scoring method can be used as a threshold level for definite change [41].

Another important issue is to establish the validity of a scoring system. Since there is no true external gold standard, a new method can be compared to the traditional scoring systems for PsA, such as mSvdHS and PARS. In this regard, SPARS correlated strongly with mSvdHS and PARS.

In terms of feasibility, the time required to apply each method differed considerably (from 4.5 min for SPARS to 14.4 min for the mSvdHS). In daily clinical practice, this aspect cannot be neglected.

The major limitations of the present study are the moderate sample size and the single-center recruitment. Moreover, the responsiveness of the structural components of the SPARS was not tested. Future studies with a long-term (preferably multicenter) design, including various subgroups of patients (e.g., early disease, new treatment with biologics or small molecules) and allowing radiographic data to be compared with other imaging techniques, are needed for external validation of the new scoring system.

In conclusion, SPARS is a new scoring system that takes into account not only destructive changes but also osteoproliferation. Its intra-rater and inter-rater reliability is comparable with that of the standard radiographic scoring systems used in PsA. SPARS is easy to perform and may be suitable for application in clinical practice or in study cohorts.