Ankylosing spondylitis (AS), a major member of the spondyloarthritis (SpA) family, is characterized by inflammatory spinal pain and occasionally by swelling of peripheral joints and entheses [1]. Several placebo-controlled studies have shown that the clinical symptoms of some, but not all, patients respond markedly to treatment with TNFα blockers such as infliximab [2]. However, these TNFα blockers are very costly. In most countries, their use is either not supported by health providers or permitted only under strict administrative control. It would be very useful to have biomarkers that allow practicing physicians to distinguish good from poor responders rapidly and reliably. Current gold standards for measuring therapeutic response, such as the Assessment in Ankylosing Spondylitis (ASAS) International Working Group criteria, are used by investigators but not by practicing physicians [3]. In addition, these response criteria rely heavily on subjective self-evaluation by the patients. The only blood tests currently used in clinical practice are the two acute phase reactants, erythrocyte sedimentation rate (ESR) and C-reactive protein (CRP). Several candidate serum biomarkers have been reported to be affected by treatment with TNFα blockers, including matrix metalloproteinase-3 (MMP-3), interleukin-6 (IL-6), vascular endothelial growth factor (VEGF), and molecules involved in cartilage turnover [4–9]. However, none of the publications addresses whether some of the biomarkers would be more useful when considered in combination. Combinations of biomarkers are frequently used by rheumatologists, for example in assessing systemic lupus erythematosus. In addition, very few of the above publications address the practical usefulness of each biomarker in terms of sensitivity and specificity, which can be calculated by receiver operating characteristic (ROC) analysis [10]. These are the two issues that we address in this paper.

We have recently reported on the clinical response of a cohort of AS patients treated with three infusions of infliximab. As in other studies involving infliximab and AS patients, there was a significant average clinical response 2 weeks after the first infusion, with decreases in ESR and CRP. However, in our cohort, ESR and CRP each offered rather low sensitivity and specificity on their own for distinguishing responders from non-responders [11]. In the current study, we added platelet count to our analysis, because platelet count has been reported to be increased in rheumatoid arthritis and to be suppressed in response to treatment [12, 13]. We also measured serum concentrations of a panel of 22 cytokines as well as MMP-3. MMP-3 is of special interest to SpA investigators because it has been reported to be highly expressed in the synovitis tissues and synovial fluids of SpA patients, and because both the serum concentration and the synovial tissue expression are suppressed by infliximab infusions [9]. In addition, serum concentration of MMP-3 in AS is a possible prognostic indicator of radiological progression [6].

First, we found that ESR, CRP, and platelet count were each inaccurate biomarkers of clinical response when considered alone. However, when the three were considered in combination, their usefulness improved markedly. Second, although serum MMP-3 was rapidly suppressed by infliximab, it lacked sufficient sensitivity and specificity to be a useful biomarker, either alone or in combination with the other factors measured here. Third, of the 22 cytokines we measured, IL-1α appeared to be the most promising candidate biomarker. Lastly, to study whether serum IL-1α, like serum MMP-3, might have been generated in the arthritis tissues, we measured synovial fluids of 19 SpA patients. Because most AS patients do not have joint effusions, we instead measured synovial fluids from patients with undifferentiated spondyloarthritis (USpA). We found that the synovial fluid concentrations of IL-1α were indeed much higher than those in the corresponding serum samples.

Materials and methods

Patient demographics, treatment and assessment protocol

The cohorts of normal subjects, AS patients receiving infliximab infusions, and AS patients outside the infliximab study have been reported previously [11, 14]. All AS patients fulfilled the Modified New York criteria for AS [15]. There were no USpA patients in the infliximab treatment cohort. For those AS patients not in the infliximab study, the Bath Ankylosing Spondylitis Disease Activity Index (BASDAI) values ranged from 1.85 to 9.6 [14, 16]. Baseline BASDAI values were >4 for those AS patients who participated in the infliximab study. Infliximab was infused intravenously in doses of 5 mg/kg at weeks 0, 2, and 6. Clinical assessment was carried out before each infusion and also at week 10. ASAS20 and ASAS40 were defined as >20% and >40% improvement, respectively, in at least three of the following four parameters: patient’s global assessment, spinal pain, Bath Ankylosing Spondylitis Functional Index, and inflammatory spinal pain, with an absolute improvement of at least one unit on a scale of 1–10 [17]. Patients were considered to have USpA if they fulfilled the European Spondyloarthropathy Study Group criteria but did not have one of the following: AS, psoriasis, ulcerative colitis, Crohn’s disease, or reactive arthritis [18]. Procedures and consent forms were approved by the respective institutes.

Panel of biomarker candidates

CRP, platelet count, and Westergren ESR were measured by the clinical laboratory. The measurement of serum MMP-3 has been reported previously [14]. Cytokine assays utilized the Beadlyte Human 22-Plex Multi-cytokine Detection System (UpState, Temecula, CA, USA) as described previously [19]. The 22 cytokines were IL-1α, -1β, -2, -3, -4, -5, -6, -7, -8, -10, -12(p40), -12(p70), -13, -15, GM-CSF, IFNγ, TNFα, eotaxin, MCP-1, RANTES, MIP-1α, and IP-10. The range of concentrations measured for most of the cytokines was 2.3–5,000 pg/ml. The cost per assay for a single cytokine was less than USD $10.

Statistical approach

The software packages used for statistical analyses were SPSS, MedCalc, ClusterTree (Eisen Lab), and SAM in Multiple Experiment Viewer. Values are expressed as mean ± standard deviation (SD). Positive predictive values (PPV) and negative predictive values (NPV) were calculated according to equations provided by MedCalc Software (Mariakerke, Belgium).

To evaluate the usefulness of each candidate biomarker in monitoring infliximab therapy, we first used non-parametric tests to assess if there were statistically significant differences among samples at various stages of treatment. For those candidates that showed statistically significant differences, we evaluated if they correlated with clinical parameters including BASDAI, ASAS20, or ASAS40. To test if these correlations were clinically useful, we used the ROC analysis. ROC results were arbitrarily regarded as significant if AUC > 0.70. To test if combinations of parameters correlated with clinical parameters, we first submitted the data of all the candidates to multiple regression analysis. Parameters of significance or near significance were then combined in a logistic model and evaluated by ROC. Finally, to relax the stringency of the threshold, we also carried out comparisons by SAM and tested the pattern of expression by ClusterTree as described previously [19].
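The ROC step described above can be sketched in a few lines of Python. This is a minimal illustration only, not the MedCalc or SPSS procedure used in this study: it computes the AUC as the probability that a randomly chosen responder scores higher than a randomly chosen non-responder (ties counted as one half) and applies the AUC > 0.70 cut-off stated above. It does not compute the accompanying p value. The function name `roc_auc`, the marker values, and the response labels are hypothetical, and the marker is negated on the assumption that lower values indicate response.

```python
AUC_THRESHOLD = 0.70  # cut-off stated in the text


def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one responder and one non-responder")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # responder ranked above non-responder
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical marker values and response labels (1 = responder).
marker = [5, 8, 12, 15, 20, 22, 30, 35]
responder = [1, 1, 1, 0, 1, 0, 0, 0]

# Assumption: lower marker values indicate response, so negate to get a score.
auc = roc_auc([-v for v in marker], responder)
passes_threshold = auc > AUC_THRESHOLD
print(f"AUC = {auc:.4f}, above threshold: {passes_threshold}")
```

The pairwise form is quadratic in the number of patients, which is not a concern for a cohort of this size.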

Results

Demographics

The 47 AS patients in the infliximab study were derived from the 63-patient cohort reported previously [11]. Forty of the 47 patients were male and 41 were HLA-B27 positive. The mean ± SD age was 32.6 ± 9.6 years, and the mean ± SD duration of disease was 10.2 ± 7.1 years. The demographics of the 19 USpA patients are shown in Table 1. Low back pain was present in nine of these USpA patients. The demographics of AS patients not in the infliximab study and of the normal subjects have been reported previously [14].

Table 1 Demographics of USpA subjects providing both synovial fluid and serum samples

Clinical response of AS patients to infliximab

Forty-seven AS patients were infused with infliximab at weeks 0, 2, and 6. Clinical responses, as measured by ASAS20 and ASAS40, were observed in weeks 2, 6, and 10 (Fig. 1, upper panel). Every patient showed a decrease in BASDAI value at week 2 (p = 1.2E−17 compared to week 0 values), and values continued to decrease at week 6 (p = 0.0002 comparing week 2 to week 6 values). There was no further decrease at week 10 (Fig. 1, lower panel).

Fig. 1

Percent of patients reaching ASAS20 or ASAS40 (upper panel) and changes in BASDAI values (lower panel) in response to infliximab therapy. Values shown are mean values. Bars represent standard deviations. Infliximab infusions were delivered in weeks 0, 2, and 6

Changes in levels of ESR, CRP, and platelet count during infliximab therapy

ESR, CRP, and platelet count were measured before the first infusion and also at weeks 2, 6, and 10 (Fig. 2, upper and lower panels). Platelet count, ESR, and CRP decreased drastically 2 weeks after the first infusion (p < 0.0001 comparing week 2 to pre-infusion values). At week 2, decreases in at least one of these three parameters were observed in 89–91% of the patients. All three parameters remained lower than the pre-infusion values at weeks 6 and 10 (p < 0.0001 compared to pre-infusion values).

Fig. 2

Changes in ESR, CRP (upper panel), platelet count, and MMP-3 (lower panel) induced by infliximab treatment. For weeks 0, 2, 6, and 10, the standard deviations for ESR: 23.4, 9.6, 5.0, 7.6; for CRP: 1.8, 0.2, 0.5, 0.2; for platelet count: 81, 67, 51, 49; for MMP-3: 24.5, 23.6, 20.3, 19

Testing if ESR, CRP, or platelet count correlate with concurrent clinical responses or predict subsequent clinical responses, as measured by ASAS20 or ASAS40

ESR, CRP, or platelet count measured at pre-infusion or weeks 2, 6, and 10 could not predict clinical responses at subsequent weeks. Although ESR, CRP, platelet count, and BASDAI values decreased in almost all patients after one infusion with infliximab, there was no correlation of ESR, CRP, or platelet count measured at any time point with the BASDAI values at the corresponding time point. Neither did changes in values of these acute phase reactants at weeks 2, 6, or 10 correlate with changes in BASDAI values at the corresponding time points.

However, when tested with multiple logistic regression, a positive correlation was observed between ASAS20 response and platelet count (p = 0.016), and to a lesser extent with ESR (p = 0.067). This will be addressed in a later section of this paper.

Changes in levels of MMP-3 induced by infliximab treatment

Serum MMP-3 decreased significantly and autoregressively with infliximab treatment (p < 0.001 comparing values at weeks 2, 6, and 10 to those at week 0) (Fig. 2, lower panel). A decrease in serum MMP-3 was observed at week 2 in 78.7% of the patients. The values of MMP-3 did not correlate with BASDAI values. Serum MMP-3 values at any time point neither correlated with nor predicted clinical responses, as measured by ASAS20 or ASAS40, at the same or later time points.

Changes in serum levels of cytokines induced by infliximab treatment

The following cytokines were measured in the same 47 AS patients at weeks 0, 2, 6, and 10 of infliximab treatment: IL-1α, -1β, -2, -3, -4, -5, -6, -7, -8, -10, -12(p40), -12(p70), -13, -15, GM-CSF, IFNγ, TNFα, eotaxin, MCP-1, RANTES, MIP-1α, and IP-10. When tested with the paired t test, the only cytokine that decreased significantly at week 2 was IL-1α (p = 0.009). The decrease was even greater at week 6 (p = 0.03 compared to week 2) and persisted to week 10 (p = 0.0009 compared to week 0). In contrast, IL-1β was low and unchanged during this period of observation (Fig. 3, upper panel). Infliximab also induced decreases in the serum levels of three other cytokines, but only at week 6: IL-3 (p = 0.003), IL-6 (p = 0.019), and IL-7 (p = 0.003) (Fig. 3, lower panel). No additional cytokines with significant changes were detected when the data were analyzed with SAM with the FDR set at 10%. Unlike the differences in ESR, CRP, and platelet count, the differences in cytokine levels became insignificant when the p values were corrected for multiple testing.
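The effect of a multiple-testing correction on the uncorrected p values reported above can be illustrated with a short Python sketch. The text does not state which correction was applied or how many comparisons were counted, so the Bonferroni method, the count of 22 tests (one per cytokine), and the 0.05 significance level used here are all assumptions made for illustration.

```python
ALPHA = 0.05   # assumed significance level
N_TESTS = 22   # assumption: one comparison per cytokine in the panel

# Uncorrected p values as reported in the text.
reported_p = {
    "IL-1alpha, week 2": 0.009,
    "IL-3, week 6": 0.003,
    "IL-6, week 6": 0.019,
    "IL-7, week 6": 0.003,
}


def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted p value, capped at 1."""
    return min(1.0, p_value * n_tests)


adjusted = {name: bonferroni(p, N_TESTS) for name, p in reported_p.items()}
for name, p_adj in adjusted.items():
    verdict = "significant" if p_adj < ALPHA else "not significant"
    print(f"{name}: adjusted p = {p_adj:.3f} ({verdict})")
```

Under these assumptions, none of the four cytokine p values stays below 0.05, whereas a p value below 0.0001, as reported for ESR, CRP, and platelet count, would remain below it.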

Fig. 3

Changes in serum levels of IL-1α, IL-1β (upper panel), IL-3, IL-6, and IL-7 (lower panel) induced by infliximab treatment. For weeks 0, 2, 6, and 10, the standard deviations for IL-1α: 77, 52, 23, 20; for IL-3: 89, 42, 33, 57; for IL-6: 32, 31, 31, 24; for IL-7: 45, 30, 15, 10

Testing if serum cytokines correlate with clinical parameters or clinical responses

Of all the cytokines, IL-1α was the only one that distinguished responders from non-responders. Serum IL-1α at week 6 correlated with ASAS40 responses in the same week. In ROC analysis, the AUC was 0.71 (p = 0.022) (Fig. 4). At a serum level of 18.6 pg/ml, the sensitivity and specificity were 84.9% and 53.8%, respectively (PPV = 0.80, NPV = 0.54).
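Sensitivity, specificity, PPV, and NPV at a given cut-off all follow from the same 2×2 table of predicted versus observed response. The Python sketch below shows that relationship under stated assumptions: the marker values, response labels, and cut-off of 18.0 are hypothetical and are not the study data, and the direction of the rule (a value at or below the cut-off predicts response) is assumed rather than taken from the text. The helper names `ratio` and `diagnostic_metrics` are likewise illustrative.

```python
def ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def diagnostic_metrics(values, responded, cutoff):
    """Sensitivity, specificity, PPV, and NPV of the rule `value <= cutoff`."""
    tp = fp = tn = fn = 0
    for value, is_responder in zip(values, responded):
        predicted = value <= cutoff  # assumption: low values predict response
        if predicted and is_responder:
            tp += 1
        elif predicted:
            fp += 1
        elif is_responder:
            fn += 1
        else:
            tn += 1
    return {
        "sensitivity": ratio(tp, tp + fn),  # responders correctly identified
        "specificity": ratio(tn, tn + fp),  # non-responders correctly identified
        "ppv": ratio(tp, tp + fp),          # predicted responders who responded
        "npv": ratio(tn, tn + fn),          # predicted non-responders who did not
    }


# Hypothetical marker values (pg/ml), response labels, and cut-off.
values = [4.0, 9.5, 12.0, 16.0, 17.5, 21.0, 25.0, 40.0]
responded = [True, True, True, False, True, True, False, False]
metrics = diagnostic_metrics(values, responded, cutoff=18.0)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike sensitivity and specificity, PPV and NPV depend on the proportion of responders in the sample, so they do not transfer directly to a cohort with a different response rate.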

Fig. 4

Area under the curve of ROC plot testing if serum IL-1α in week 6 reflects ASAS40. The AUC is 0.71 (p = 0.022)

Testing the usefulness of combinations of biomarkers

When platelet count and ESR at week 2 were combined in a logit model and evaluated by ROC analysis for distinguishing patients with an ASAS20 response, the result was significant (AUC = 0.75, p = 0.0013). The values of ESR and CRP did not follow a normal distribution. When their values were adjusted and platelet count, ESR, and CRP were entered into a logit model in ROC analysis, the discriminatory power was even higher (AUC = 0.804, p = 0.003), with a sensitivity of 81.3% and a specificity of 72.7% (PPV = 0.91, NPV = 0.54). Combinations with serum MMP-3 or with cytokines did not provide greater power to distinguish ASAS20 response. No combination could predict or distinguish responders from non-responders in any other week, even when tested by ClusterTree analysis.
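The idea of entering several markers into a logit model and then evaluating the fitted score by ROC analysis can be sketched as follows. This is a toy Python illustration under stated assumptions, not the SPSS or MedCalc procedure used here: the logistic model is fitted by plain gradient descent, the two markers and eight patients are hypothetical and were constructed so that neither marker separates the groups alone while their combination does, and the adjustment for non-normal values mentioned above is omitted because the text does not specify it. The names `roc_auc` and `fit_logit` are illustrative.

```python
import math


def roc_auc(scores, labels):
    """AUC as P(responder score > non-responder score); ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_logit(rows, labels, lr=0.1, steps=2000):
    """Fit logistic regression by gradient descent; returns (weights, bias)."""
    w = [0.0] * len(rows[0])
    b = 0.0
    n = len(rows)
    for _ in range(steps):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            z = b + sum(wj * xj for wj, xj in zip(w, x))
            err = 1.0 / (1.0 + math.exp(-z)) - y  # predicted probability minus label
            grad_b += err
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Hypothetical toy data: two markers per patient, 1 = responder.
rows = [(1, 4), (2, 3), (3, 2), (4, 1), (3, 6), (4, 5), (5, 4), (6, 3)]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

w, b = fit_logit(rows, labels)
combined = [b + sum(wj * xj for wj, xj in zip(w, x)) for x in rows]

# First marker alone, negated on the assumption that lower values mean response.
auc_single = roc_auc([-x[0] for x in rows], labels)
auc_combined = roc_auc(combined, labels)
print(f"single marker AUC = {auc_single:.3f}, combined AUC = {auc_combined:.3f}")
```

Because the AUC is computed on the same patients used to fit the model, it is an optimistic estimate; an independent cohort or cross-validation would be needed to confirm it.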

Cytokine levels in synovial fluid and serum samples obtained from the same patients

Using multiplex ELISA, the following 22 cytokines were measured in synovial fluid and serum samples obtained at the same time from the same 19 patients with USpA: IL-1α, -1β, -2, -3, -4, -5, -6, -7, -8, -10, -12(p40), -12(p70), -13, -15, GM-CSF, IFNγ, TNFα, eotaxin, MCP-1, RANTES, MIP-1α, and IP-10. Higher levels in synovial fluid than in serum were observed for only three cytokines: IL-1α (751 ± 357 versus 111 ± 105 pg/ml, p = 0.0000001), IL-6 (3,744 ± 2,702 versus 73 ± 234 pg/ml, p = 0.0001), and IP-10 (11,216 ± 9,834 versus 1,107 ± 1,580 pg/ml, p = 0.0004) (Fig. 5, y-axis in logarithmic scale). All three cytokines were still significantly higher in the synovial fluid than in the serum samples when only the nine USpA patients with low back pain were included in the calculation. Although there were differences in cytokine levels among serum samples, the p values were not statistically significant after correction for multiple testing.

Fig. 5

Concentrations of IL-1α, IL-6, and IP-10 in synovial fluid and serum samples. Note: y-axis is in log scale

Discussion

The use of TNFα blockers such as infliximab has revolutionized the treatment of AS [2]. However, the medications are expensive, and a consistent percentage of patients either do not respond well or relapse during the course of therapy. It would be very useful to have simple blood tests either to predict whether a particular patient will respond well or to serve as an objective surrogate for monitoring the response. Currently, the only blood tests being used in clinical practice are ESR and CRP. However, in our own study [11], ESR and CRP did not carry sufficient sensitivity or specificity for monitoring infliximab therapy. Clearly, other biomarkers are needed. Other than ESR and CRP, a considerable number of candidate serum biomarkers have been tested in AS. The most recent ones include IL-1β, IL-6, TGFβ, MMP-3, M-CSF, VEGF, amyloid A, YKL-40, C2C neoepitope, C-propeptide of type II collagen, and aggrecan 846 epitope. The candidate biomarker most frequently reported has been MMP-3 [4, 5, 7–9]. The serum levels of several of these candidate biomarkers have been reported to be suppressed by TNFα blocker therapies. However, none of these reports has addressed the practical usefulness of these biomarkers in terms of sensitivity and specificity.

Like some previous studies, the present study is based on a cohort of AS patients with active disease who received three standard pulses of infliximab. The clinical response of these patients is similar to that reported in other cohorts [19]. For acute phase reactants, in addition to ESR and CRP, we also took the platelet count into consideration, because it has been reported to reflect clinical activity in rheumatoid arthritis [12, 13]. Besides the addition of platelet count, the first major difference between our study and previous ones is that we evaluated our results by ROC analysis, so that we could express the usefulness of our tests directly in terms of sensitivity and specificity. As in other studies, each acute phase reactant had limited usefulness in terms of sensitivity and specificity. The second major difference is that, instead of analyzing each acute phase reactant only separately, we also used regression analysis to analyze them as a combination. We found that, when considered as a combination, ESR plus CRP plus platelet count provided a rather accurate assessment of response for the week immediately after the first infliximab pulse. It is probably intuitive for practicing clinicians to take several laboratory tests into account as combinations and not just individually. Here, we were able to validate this using statistical methods.

A third difference between our study and previous ones is that we screened a total of 22 candidate cytokines in the same serum samples. Similar to previous studies, there is suppression of serum IL-6 after the infusions with infliximab [8]. However, our analysis shows that changes in serum IL-6 are not sufficiently sensitive or specific in reflecting clinical improvement. Of the 22 serum cytokines we tested, only serum IL-1α provides enough sensitivity and specificity to reflect clinical improvement, and only at week 6. Similar to previous studies, we also observed dramatic suppression of serum MMP-3 even after a single infusion. Unlike previous reports, we tested if changes in serum MMP-3 are useful in clinical practice. Unfortunately, the sensitivities and specificities are insufficient for it to be of practical use.

Much of the serum MMP-3 in SpA patients is thought to be derived from the arthritis tissues, because synovial fluid concentrations are much higher than serum concentrations [14, 20, 21]. In this study, we also compared the synovial fluid concentrations of cytokines with the serum concentrations in the same patients. We found that, as with MMP-3, the synovial fluid concentrations of IL-1α were much higher than the serum concentrations.

The present study has several limitations. The duration of study was limited to 10 weeks. The study did not directly address whether patients who did not show statistically significant changes in acute phase reactants or IL-1α should stop additional infliximab infusions. Although the sensitivities and the PPV of the identified biomarkers were high, their specificities and NPV were low. Some of the biomarker associations were with ASAS20 response, which would be considered a low level of response in the use of biologics. As a matter of fact, instead of using only these types of criteria of responses, future studies on biomarkers should address individual parameters such as pain, stiffness, physical function, and radiological progression of disease.

Nevertheless, our results suggest for the first time that, as biomarkers for monitoring infliximab infusions in AS, combinations of acute phase reactants are more useful than each by itself. In addition, serum IL-1α is a promising candidate biomarker.