Introduction

Crohn’s disease (CD) and ulcerative colitis (UC), collectively known as inflammatory bowel diseases (IBD), are chronic inflammatory disorders of the gastrointestinal tract with heterogeneous disease presentation and natural history. Genome-wide association studies have identified over 160 risk loci associated with IBD [1], with distinct genetic profiles associated with specific IBD phenotypes [2, 3]. In addition, a number of disease characteristics, including location and behavior, age of onset, and disease duration, are associated with different outcomes and response to therapy [4–6]. As a result, the management of IBD will continue to move toward personalized care accounting for the patient’s unique genetic, clinical, and environmental background [7].

Persistent, active intestinal inflammation in IBD can severely impair quality of life and lead to increased hospitalization and surgical rates [8, 9]. Ideally, the goal of therapy is deep remission, a combination of endoscopic and clinical remission, which is associated with improved patient outcomes [10]. However, endoscopic evaluation is not always feasible due to its invasive nature, burden to patients, expense, the risk of complications, and the possibility that the site of active disease may not be reached. As a result, noninvasive biochemical markers, clinical activity indices (CAIs), and health-related quality of life (HRQOL) scores have been used as surrogate markers for disease remission.

While a number of prior studies have demonstrated good correlations between clinical activity or HRQOL scores and endoscopic disease activity or inflammatory markers, particularly in UC [11–13], others have shown fair to poor correlation in assessing luminal CD [14–16]. The subjective nature of these indices may explain their poor performance in some studies, but the degree to which disease heterogeneity contributes to this variability is unclear. Although a unique set of CAIs has been developed for pediatric-onset IBD [17, 18], similar disease activity indices are used in adults regardless of age of onset or disease characteristics.

As management of IBD moves toward more personalized care, the validity of current disease activity and HRQOL indices according to various disease phenotypes may be critical in development of unique patient-reported outcome measures. We therefore sought to examine the validity of current CAI and HRQOL indices as they relate to endoscopic disease activity according to various disease phenotypes. We hypothesize that the correlations between clinical activity or HRQOL scores and endoscopic disease activity vary according to disease phenotypes.

Materials and Methods

Study Population

Starting in 2004, adult patients, age ≥18 years, with a diagnosis of CD or UC were recruited in the Prospective Registry in IBD Study at Massachusetts General Hospital. At the time of recruitment, detailed information on disease characteristics according to the Montreal classification, lifestyle factors including smoking, body mass index, and dietary factors, along with other comorbid conditions was collected. For this study, patients with available clinical and endoscopic activity indices measured within 1 month of each other were eligible for primary analyses. The institutional review board at the Massachusetts General Hospital approved this study.

Assessment of Clinical and Endoscopic Activity Indices

In a subset of participants, information on clinical activity indices, namely the Harvey–Bradshaw Index (HBI) for CD and the Simple Clinical Colitis Activity Index (SCCAI) for UC, and on quality of life, measured with the Short IBD Questionnaire (SIBDQ), was collected by research coordinators and further confirmed by the treating gastroenterologists. We also retrospectively reviewed endoscopic examinations performed within 1 month of collection of clinical activity indices. Two physicians calculated the Simple Endoscopic Score for CD (SES-CD) and the Mayo endoscopic score for UC through review of endoscopic images and reports.

Assessment of Disease Characteristics and Other Covariates

Information on age of diagnosis, disease behavior (inflammatory, penetrating, stricturing, and perianal disease), disease location (ileal, colonic, and ileocolonic) for CD or disease extent (proctitis, left-sided, and extensive) for UC, previous surgeries for CD, duration of disease, and ever use of IBD-related medications (steroids and biologics) was collected and confirmed by review of medical records and further verified by primary gastroenterologists. The retrospective nature of the study prevented accurate assessment of medication use at the time of clinical activity and quality of life scoring. In addition, we obtained inflammatory markers (ESR and CRP) at the time of ascertainment of clinical activity indices.

Statistical Analysis

We used Spearman rank correlation to quantify the association between endoscopic scores and CAI scores, quality of life (QOL) scores, and CRP and ESR levels. We defined correlations <0.3 as poor, 0.3–0.7 as good, and >0.7 as excellent [19]. We also calculated correlations within strata defined by age of diagnosis (17–39, 40–59, ≥60), disease behavior, disease location or extent, disease duration (<10 or ≥10 years), inflammatory markers (above or below the upper limit of normal), and surgical history. The stratified correlation coefficients were then compared using a Fisher r-to-z transformation [20, 21]. All analyses were conducted using R statistical software. All p values were two-sided, and the threshold for statistical significance was set at 0.05. However, we used a Bonferroni correction to adjust for multiple comparisons in analyses that compared correlations according to disease phenotype, with the threshold for statistical significance set at 0.017.
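The comparison of stratified correlations can be illustrated with a short sketch. The Python snippet below is an illustration only and is not the authors' code (the study's analyses were run in R). It applies the standard Fisher r-to-z test for two independent correlation coefficients, z = (atanh(r1) − atanh(r2)) / sqrt(1/(n1 − 3) + 1/(n2 − 3)), which is commonly used with Spearman coefficients as an approximation. The function name, the sample sizes in the example, and the assumption that the 0.017 threshold arises from three comparisons (0.05 / 3) are all hypothetical.

```python
import math

# Assumption: the Bonferroni threshold of 0.017 corresponds to 3 comparisons.
N_COMPARISONS = 3
BONFERRONI_ALPHA = 0.05 / N_COMPARISONS


def compare_correlations(r1: float, n1: int, r2: float, n2: int):
    """Compare two independent correlations with the Fisher r-to-z test.

    Returns the z statistic and the two-sided p value.
    """
    if not (-1.0 < r1 < 1.0 and -1.0 < r2 < 1.0):
        raise ValueError("correlations must lie strictly between -1 and 1")
    if n1 <= 3 or n2 <= 3:
        raise ValueError("each group needs more than 3 observations")

    # Fisher transformation: z_i = atanh(r_i)
    z1, z2 = math.atanh(r1), math.atanh(r2)

    # Standard error of the difference between the transformed values
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se

    # Two-sided p value from the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Hypothetical example: the two coefficients are taken from the text,
# but the group sizes (100 each) are invented for illustration.
z_stat, p_value = compare_correlations(0.73, 100, 0.45, 100)
print(f"z = {z_stat:.2f}, p = {p_value:.4f}, "
      f"significant after Bonferroni: {p_value < BONFERRONI_ALPHA}")
```

Because the p value depends strongly on the stratum sizes, the output of this example should not be read as a reproduction of any result reported in this study.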

Results

In total, 282 CD and 226 UC cases with available information on endoscopic disease activity and CAIs were eligible for our study (Table 1). The mean age at diagnosis for CD and UC patients was 30 and 33 years, respectively. At the time of enrollment, the mean duration of disease for CD and UC was 9 and 10 years, respectively. Among CD patients, 72 % had some ileal involvement while 28 % had isolated colonic disease. Over half of CD participants had either stricturing or penetrating disease at the time of enrollment. Among UC patients, 55 % had extensive disease and just 11 % had proctitis.

Table 1 Clinical characteristics of participants according to disease type

In a small validation study (N = 20) of endoscopic scoring, the inter-observer correlations for Mayo endoscopic score and SES-CD were 0.95 and 0.98, respectively. In addition, in a subset of patients where endoscopic activity scores also were documented by treating gastroenterologists, the correlations between endoscopic activity scores reported by the treating gastroenterologists and those reported by the two reviewers were consistently greater than 0.90.

The SES-CD correlated poorly with SIBDQ (r = −0.16) and HBI (r = 0.18) and showed good correlation with ESR (r = 0.30) and CRP (r = 0.39) (Table 2). We observed good correlation of the Mayo endoscopic score with SIBDQ (r = −0.56), SCCAI (r = 0.55), ESR (r = 0.33), and CRP (r = 0.32).

Table 2 Comparisons of clinical activity indices, quality of life scores, and inflammatory markers with endoscopic activity indices

Disease Location and Behavior

Although the correlations between HBI and SES-CD appeared to be better in ileal CD compared to colonic or ileocolonic disease, these comparisons did not reach statistical significance (Fig. 1). Similarly, the correlations between SES-CD and SIBDQ according to disease behavior were not significantly different (Fig. 2). We observed a similar pattern when comparing SES-CD to inflammatory markers according to disease location.

Fig. 1 Correlations between endoscopic activity and indirect disease measures according to Crohn’s disease location

Fig. 2 Correlations between endoscopic activity and indirect disease measures according to Crohn’s disease behavior

We also explored the correlations of CAIs, QOL indices, and inflammatory markers with SES-CD according to disease behavior and observed no significant differences (Fig. 2). Specifically, the SES-CD had poor correlations with HBI, SIBDQ, and ESR and good correlations with CRP regardless of disease behavior. Finally, presence of perianal disease was not associated with any differences in the correlations between CAIs, SIBDQ, or ESR/CRP and SES-CD (Fig. 3).

Fig. 3 Correlations between endoscopic activity and indirect disease measures according to presence or absence of perianal disease

In UC, compared to extensive disease, we observed a better correlation between SCCAI and Mayo endoscopic score in left-sided colitis (r = 0.73 vs. 0.45, p comparison = 0.005) (Fig. 4). The correlations between SIBDQ and Mayo endoscopic score were good across different disease locations with no statistically significant difference between the correlations. CRP and ESR values appeared to correlate better with endoscopic score in extensive colitis compared to left-sided colitis (p comparison = 0.006 and 0.05, respectively).

Fig. 4 Correlations between endoscopic activity and indirect disease measures according to ulcerative colitis extent

Age at Diagnosis

In UC, SCCAI and SIBDQ had good correlations with the Mayo endoscopic score, and these correlations were not significantly different according to age of onset (S1). In CD, we observed consistently poor correlations between clinical activity or HRQOL scores and endoscopic activity indices for patients <60 years old and good correlation for those ≥60 years, although none of the comparisons was statistically significant (S1).

Surgical History and Disease Duration

In CD, the correlations between CRP and SES-CD were good regardless of surgical history (S2). Similarly, the correlations of HBI and SIBDQ with SES-CD were poor regardless of prior surgery (S2).

We also explored whether the correlation between clinical and endoscopic activity indices differs according to disease duration (S3). In CD, HBI and SIBDQ had poor to good correlation with SES-CD, while inflammatory markers correlated well with SES-CD regardless of disease duration. In UC, SCCAI correlated better with endoscopic disease activity among patients with less than 10 years of disease duration than among those with longer disease duration (r = 0.61 vs. 0.37, p comparison = 0.04). SIBDQ had good correlation with endoscopic disease activity, and this was independent of disease duration. Inflammatory markers had good correlation with endoscopic disease activity, irrespective of disease duration.

Discussion

In this large cross-sectional study, we show that in CD, HBI and SIBDQ overall correlate poorly with endoscopic disease activity regardless of age at diagnosis, disease location and behavior, duration of disease, and surgical history. In UC, although SCCAI and SIBDQ correlate well with endoscopic disease activity overall, the correlations appeared to be stronger in left-sided UC and in individuals with shorter duration of disease. This is the first study to substantiate the use of HRQOL and clinical activity indices among different IBD phenotypes according to previously validated endoscopic activity markers.

Despite the variability of clinical presentations and outcomes of CD and UC patients, for the most part in our study the validity of noninvasive measures of disease activity as they relate to endoscopic disease activity is not significantly different according to disease behavior and location and age of onset. This may suggest a couple of different possibilities. First, clinical activity and HRQOL scores in IBD may be insufficiently sensitive to distinguish among different IBD phenotypes. Second, endoscopic disease activity may not be an accurate measure of complete disease remission, particularly with regard to CD where active disease may be located beyond the reach of endoscopic evaluation. In UC, histological remission is associated with less disease relapse and lower colorectal cancer risk when compared to endoscopic remission, suggesting that microscopic remission may be a more accurate predictor of disease outcome [22]. In CD, a transmural process, mucosal biopsies may not be a sufficient indication of “true remission.”

Traditionally, HRQOL and clinical activity scores have been validated against biochemical parameters and other clinical activity scores [23, 24], though more recently correlation with mucosal assessment has become standard [25, 26]. This appears to be a more clinically relevant comparison as the goal of IBD therapy is mucosal healing, which in turn is associated with durable remission and decreased risk of surgery [27]. Consistent with previous work [14, 15, 25], in our study, SIBDQ and HBI correlated poorly with endoscopic disease activity in CD. When examining the relationship between SIBDQ and endoscopic disease activity, Casellas et al. [25] reported a correlation of −0.31, similar to the −0.17 in this study. However, they did not use validated endoscopic disease scores for either CD or UC. The SES-CD and the Mayo endoscopic score in UC, used in the current study, appear to be the most reliable measures of endoscopic disease assessment [28, 29]. In UC, our finding was consistent with others showing good correlation of SIBDQ and SCCAI with endoscopic disease activity [30].

HRQOL scores and CRP have been evaluated according to various disease characteristics in CD. A European study of 189 CD patients found no difference between patients based on the Vienna classification when measured by the Psychological General Well-Being Index, the EuroQol, and the IBD Questionnaire [31]. SIBDQ has been measured according to disease phenotype in patients developing CD after ileal pouch-anal anastomosis for UC, and no difference was found between various phenotypes [32]. In CD, after excluding fibrostenotic disease, elevated CRP levels have been more commonly associated with colonic and ileocolonic disease than ileal disease [33], though this finding has not been consistent in all IBD studies [34]. Additionally, there was no statistically significant association between high-sensitivity CRP and disease behavior [33]. Our study did not show that patient heterogeneity affected HRQOL and CRP. Additionally, unlike previous analyses, we also evaluated the validity of CRP and HRQOL scores in older-onset (age ≥60) disease, as recent data suggest that the elderly may have a less aggressive disease course [35]. Variations in CRP findings across prior studies may be related to the wide range of cutoff values used in IBD [36, 37], and currently there is no accepted cutoff value. A cutoff value of 10 mg/L has been proposed for high-sensitivity CRP at diagnosis as a predictor of disease exacerbation [33].

Recently, the FDA has mandated that clinical trial endpoints move away from using CAIs and toward endoscopic disease assessment and patient-reported outcome measures [38]. These measures are expected to address the gaps of previously utilized outcomes and highlight the patient experience. Although several IBD patient-reported outcome measures have been proposed [39, 40] or served as endpoints in clinical trials [10], none have been validated or correlated with endoscopic disease activity. Kappelman et al. used the Patient-Reported Outcomes Measurement Information System (PROMIS), developed by the National Institutes of Health, in a longitudinal and cross-sectional study. In analyzing over 10,000 patients, they found a statistically significant association between PROMIS and increasing SIBDQ, short Crohn’s Disease Activity Index, and SCCAI. Additionally, short disease duration (<1 year) was associated with the highest anxiety and depression in CD and the highest anxiety and fatigue in UC independent of disease activity. Also, for most outcome measures, age >60 was associated with better outcomes than age 18–30 [40]. In this study, the differences between outcomes among the different phenotypes were often small, suggesting patient-reported outcome measures may be more sensitive to subtle differences between patient phenotypes than currently used HRQOL and disease activity measures.

Various PROMIS scores are associated with disease activity and QOL scores in IBD [40]. However, since QOL and disease activity scores often do not correlate well with endoscopic findings, the accuracy of patient outcome measures in predicting active intestinal inflammation is unclear. If further studies reveal little to no correlation between patient-reported outcomes and mucosal healing, appropriate medical management in patients will require an effective balance between these measures.

We acknowledge several limitations. First, our participants were from a single tertiary IBD center and therefore our results may not be generalizable to other populations. However, the characteristics of participants in our study including age of diagnosis, disease location, and rates of complications were similar to previous natural history studies of large population-based cohorts [41, 42]. Second, the number of participants in some of the categories of disease location, behavior, and age of diagnosis was small, limiting the precision of our estimates. However, to date our study represents one of the largest and most comprehensive studies attempting to examine the validity of clinical indices according to various disease characteristics. In addition, we were able to examine the validity of clinical indices in older-onset patients (age ≥60 years), an advantage over prior studies [43]. Further studies should be done to confirm our findings. Third, over 90 % of patients in the study were Caucasian, so our results may not extend to other races or ethnicities. Though a more aggressive disease course in African-Americans was suggested previously [44], further analysis has revealed that these differences were likely due to socioeconomic status and access to health care [45]. A recent study showed no difference in perioperative complication rates between Hispanics and non-Hispanics [46]. Furthermore, HRQOL scores and the HBI in CD have been validated in African-Americans and remain similar to those of Caucasians during the course of disease [45]. Finally, we acknowledge the subjective nature of retrospective evaluation of endoscopic disease activity. However, in our small validation study, the correlations between the two reviewers and between the reviewers and the endoscopists were excellent.

Accurate identification of active intestinal disease in patients with varying phenotypes is key to early and effective therapy. While quality of life and clinical activity scores reflect endoscopic disease activity in UC, they perform poorly in CD. The utility of HBI, SCCAI, and SIBDQ appears to be largely similar across disease phenotypes. Further studies are necessary to validate other indices and newly developed IBD patient-reported outcome measures based on various phenotypes.