Background

In people with rheumatoid arthritis (RA), fatigue is a primary symptom that is frequent, highly variable, and often severe and disabling, and it affects multiple aspects of health, work and social participation, and quality of life [1,2,3,4,5]. Fatigue in RA often results from systemic inflammation, limitation in joint mobility, and other factors including excess weight, poor or interrupted sleep, anxiety, depression, and stress. The experience of fatigue varies considerably from one person with RA to another. Thus, there are different approaches to conceptualizing fatigue, and little consensus on how to measure it [6,7,8,9].

The Patient-Reported Outcome Measurement Information System (PROMIS) was developed by the National Institutes of Health (NIH) to enhance measurement of physical, emotional, and social health across chronic conditions [10, 11]. PROMIS developers defined fatigue as an overwhelming, debilitating, and sustained sense of exhaustion that interferes with daily activities, work, and family or social roles [12]. The PROMIS Fatigue item bank includes > 90 items that ask about fatigue severity and its impact on day-to-day function. PROMIS Fatigue tools can be administered using computer adaptive testing (CAT) or with fixed-item Short Forms (SFs) containing 4, 7, or 8 items. The 7- and 8-item SFs contain non-overlapping items; the items in the 4-item version are also contained in the 8-item SF. Items have been calibrated in the U.S. general population and in some health conditions [13, 14]. Scores are reported on a T-score metric, where the population mean is 50 and the standard deviation is 10. We previously reported that Fatigue CAT scores correlated moderately strongly with other indicators of disease activity and with a fatigue visual analogue scale (VAS), and increased with worsening disease [15]. As practical constraints limit widespread implementation of CATs in many settings, SFs may be preferable to administer in some situations [16, 17].
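
To make the T-score metric concrete, the following is a minimal sketch of the linear relationship between a standardized score and a T-score (mean 50, SD 10). The function names are hypothetical and chosen for illustration only. This is not how PROMIS scores are derived from item responses, which relies on IRT calibrations; it only shows how a reported T-score relates to distance from the reference population mean.

```python
def z_to_t(z: float) -> float:
    """Convert a standardized (z) score to the T-score metric (mean 50, SD 10)."""
    return 50.0 + 10.0 * z


def t_to_z(t: float) -> float:
    """Express a T-score as a number of SDs from the reference population mean."""
    return (t - 50.0) / 10.0


if __name__ == "__main__":
    # A T-score of 55 lies half a standard deviation above the reference mean.
    print(t_to_z(55.0))
    # One standard deviation above the mean corresponds to a T-score of 60.
    print(z_to_t(1.0))
```

Read this way, a 5-point difference between two T-scores corresponds to 0.5 SD on the reference population metric.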

Before widespread use of Fatigue SFs can be recommended in specific patient populations, evidence is required that items are relevant and can capture the full range of patient experiences [18, 19]. Although items in the Fatigue item bank were debriefed according to PROMIS standards in 22 patients with a range of medical conditions [12], the degree to which they are comprehensive and easily comprehended is unstudied. To address this gap, we evaluated content validity by reviewing items in the PROMIS Fatigue SFs 7a and 8a with a diverse group of people with RA using online surveys and in-person cognitive debriefing interviews.

Methods

Study design

To assemble a sample with diverse sociodemographic and RA characteristics, we recruited participants from several sources. We invited two groups to complete an online survey: RA patients who were receiving treatment and enrolled in ongoing observational studies at academic medical centers, and individuals with self-identified RA who were affiliated with an online arthritis community. The survey included SF items and additional questions about the comprehensiveness and comprehensibility of the items in relation to their fatigue experience. We conducted cognitive debriefing interviews in a separate group recruited from three academic arthritis centers. The survey was approved by the Johns Hopkins Institutional Review Board (IRB; 00059765). The debriefing was conducted with central oversight from the Johns Hopkins IRB (00059930) and with oversight at each site. Informed consent was obtained from all participants.

Participants

Online survey

A total of 57 adults, aged 18 years and older, who were fluent in English, had been diagnosed with RA, and were enrolled in an observational trial at the Johns Hopkins Arthritis Center, were consecutively approached by study staff; 52 agreed to participate. We also partnered with an online arthritis community (https://CreakyJoints.org/) to recruit participants with RA (hereafter referred to as the “online” participants) via email. Online participants were screened using inflammatory arthritis questions from the Connective Tissue Screening Questionnaire [20] which we adapted to include the 2010 ACR/EULAR RA criteria [21]. They also completed a DMARD checklist and answered questions to help identify and exclude those with a personal or family history of psoriasis or psoriatic arthritis (PsA) [22]. We tested our two-step screening approach using medical records of a convenience sample of 52 patients receiving care for RA and PsA from our general arthritis clinic and found this approach had 100% specificity for identifying people with RA.

The survey was conducted from April to September 2015. On the welcome page, we stated that the survey was voluntary and anonymous, and that completion was interpreted as providing informed consent. After providing information about age, sex, race/ethnicity, education, employment, year of diagnosis, and type of arthritis care provider (rheumatologist vs. other), online participants completed a self-assessment of their current disease activity and the two PROMIS Fatigue SFs (v1.0). They then rated how easy the questions were to understand and answer. We asked if the SF items covered the type of fatigue they experienced (current or past), and if additional questions were needed to fully capture their experience. To better understand how participants conceptualized their fatigue and selected responses, we asked if they distinguished overall fatigue from fatigue due to their RA; how much of their fatigue they attributed to RA (all, most, some, a little, none, can’t tell); and whether their answers would be different if asked to rate only RA-related fatigue (yes/no). We also asked if their fatigue level provided important information about how effectively their current treatment was controlling their RA.

Cognitive debriefing interviews of individual items

Cognitive debriefing participants were recruited from three academic arthritis clinics in Baltimore MD, New York NY, and Birmingham AL. Interviews were conducted from September through December 2015. Debriefing participants provided the same sociodemographic and RA information described above and completed both SFs. They were then randomly assigned, using a random number generator, to be debriefed on the items in one of the two SFs. Two trained interviewers (AB, KG) conducted face-to-face or phone interviews in which they followed a script to guide participants to “talk-through” how they interpreted and answered each of the items [19]. Participants also answered questions about how they thought about their fatigue when selecting responses: whether they mostly considered the intensity of their fatigue, its impact, or both; and whether their answers reflected their fatigue at its worst or its average over the past 7 days. Conversations were audiotaped and transcribed. We conducted a targeted and pragmatic qualitative analysis of the interviews to descriptively and thematically summarize the information.
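
The random assignment step can be illustrated with a short sketch. The paper does not describe the generator or the allocation scheme that was used, so the function name, the seed parameter, and the use of simple (unrestricted) randomization below are assumptions made for illustration.

```python
import random


def assign_short_forms(participant_ids, seed=None):
    """Assign each participant to be debriefed on SF 7a or 8a.

    Hypothetical helper: simple randomization, one independent draw per person.
    """
    rng = random.Random(seed)  # a fixed seed makes the allocation reproducible
    return {pid: rng.choice(("7a", "8a")) for pid in participant_ids}


if __name__ == "__main__":
    # 32 interviewees, numbered 1..32 (identifiers are made up).
    allocation = assign_short_forms(range(1, 33), seed=2015)
    print(sum(1 for form in allocation.values() if form == "7a"), "assigned to 7a")
```

Simple randomization of this kind does not guarantee equal group sizes; a blocked scheme would be needed if balance between the two forms were required.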

Statistical approach

Descriptive statistics were calculated, and t-tests and chi-square tests were used to compare groups. Pearson and Spearman correlations were calculated to evaluate relationships between measures. Free-text responses were summarized. The PROMIS Assessment Center Scoring Service was used to obtain IRT-calculated scores. As characteristics were similar among participants from the three academic centers, data were collapsed for subsequent analysis. Statistical analysis was performed using SPSS V24.0, and p < .05 was considered statistically significant.
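
For readers who want to see what the two correlation coefficients compute, the sketch below implements Pearson's r and Spearman's rho with the Python standard library. The analyses in this study were run in SPSS; the function names and the example values here are hypothetical and are not study data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson's r computed on average ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Made-up T-scores and 0-10 disease activity ratings, for illustration only.
    fatigue_t = [48.2, 53.1, 57.5, 61.0, 66.3]
    activity = [2, 3, 5, 5, 8]
    print(round(pearson_r(fatigue_t, activity), 2))
    print(round(spearman_rho(fatigue_t, activity), 2))
```

Pearson's r describes linear association on the raw scores, whereas Spearman's rho depends only on rank order and is therefore less sensitive to skewed distributions.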

Results

Participants were from regions across the US, and were mostly female, white, and middle aged (Table 1). Most had attended or completed college and had lived with RA for a decade on average. Among online participants, 305 unique individuals were screened, and 95 (31%) were excluded, mostly (58%) because of a personal/family history of psoriasis. Of 210 (69%) who met eligibility criteria, 200 (66%) completed the survey. Almost all online participants (193/200) reported their RA was managed by a rheumatologist.

Table 1 Participant characteristics

As compared with participants recruited from clinics (N = 82), online participants had more years of education, had a shorter disease duration, were more likely to report that they were disabled due to RA and that they used disability aids and devices, and had higher disease activity scores.

Survey responses

Mean PROMIS Fatigue SF scores were similar (p = .93) among clinic patients and reflected mild levels of fatigue (i.e., 53.2–54.8), but were significantly higher (i.e., > 1 SD; p < .001) in online participants (Table 1). Scores on the two SFs were highly correlated (r values from 0.82 to 0.94; p values < .001) and the mean difference between the 7a and 8a was −0.5 (95% CI −1.04, 0.04). However, in 25% of cases, the difference between the two SFs exceeded 5 points (0.5 SD; range 12–25 points). Among those for whom the two fatigue SF scores differed by at least 5 points (i.e., discrepant scores), in 64% of individuals, the 7a score was lower than the 8a. Individuals with discrepant scores did not differ from those with similar scores by any sociodemographic or RA characteristic examined. The mean standard error was higher in the 7a than 8a (2.8 vs. 2.1, respectively). Notably, self-assessments of disease activity were also similar among clinic patients and significantly higher (p < .001; reflecting worse RA) in the online community.
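
The comparison of the two SFs described above can be sketched as follows. The example computes paired differences (7a minus 8a), their mean with an approximate 95% confidence interval, and the share of discrepant pairs. The function name, the normal-approximation interval (mean ± 1.96 × SE), the "at least 5 points" threshold, and the example scores are all assumptions for illustration; they are not the study data or necessarily the exact method used.

```python
from math import sqrt
from statistics import mean, stdev


def summarize_differences(scores_7a, scores_8a, threshold=5.0):
    """Summarize paired differences (7a minus 8a) between two sets of T-scores."""
    diffs = [a - b for a, b in zip(scores_7a, scores_8a)]
    mean_diff = mean(diffs)
    se = stdev(diffs) / sqrt(len(diffs))  # standard error of the mean difference
    ci_95 = (mean_diff - 1.96 * se, mean_diff + 1.96 * se)  # normal approximation
    # Discrepant pairs: scores that differ by at least `threshold` points.
    discrepant = [d for d in diffs if abs(d) >= threshold]
    return {
        "mean_diff": mean_diff,
        "ci_95": ci_95,
        "share_discrepant": len(discrepant) / len(diffs),
        "n_7a_lower": sum(1 for d in discrepant if d < 0),
    }


if __name__ == "__main__":
    # Made-up T-scores for four respondents.
    summary = summarize_differences(
        [50.0, 55.0, 60.0, 48.0],
        [51.0, 54.0, 66.0, 42.0],
    )
    print(summary)
```

A mean difference near zero can coexist with a sizeable share of discrepant pairs, because positive and negative differences offset one another in the average.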

Almost all (99%) respondents rated fatigue as an important symptom they considered when deciding how well their current treatment was controlling their RA (Table 2). The majority of participants (70–92%) reported that the 15 SF items in the 7a and 8a “completely” or “mostly” reflected their experience of fatigue. About 1 in 8 suggested that it would be useful to ask about other factors including sleep (“my fatigue causes me to sleep more”), the impact of fatigue on tasks requiring attention and memory (“brain fog”), and how fatigue affected work, social, and intimate relationships. One participant noted that the fear of becoming tired often caused them to limit activities.

Table 2 Participant responses to questions about fatigue by source

Almost all (≥ 94%) reported they were able to distinguish a general sense of fatigue from the fatigue they attributed directly to RA. Most (83%) of the online respondents (who reported significantly greater fatigue), as compared with 33% of clinic participants, attributed “a lot” or “all” of their fatigue to their RA. When selecting a response, 72% of clinic patients and 87% of the online participants indicated that they were specifically describing fatigue they attributed to their RA or a combination of RA and general fatigue. Further, when asked to rate only the fatigue they attributed to their RA, few (up to 11%) indicated they would have provided a different response. Correlations between patient self-assessments of disease activity and scores on the SFs were moderate (r values from 0.64 to 0.65; p < .001; Fig. 1).

Fig. 1 Relationship between patient assessment of rheumatoid arthritis disease activity and PROMIS Fatigue 7a short form scores

Item-level debriefing

A diverse sample of participants was recruited from academic arthritis centers in Baltimore (n = 12), New York (n = 12), and Birmingham (n = 8) (Table 1). Interviewees also indicated that SF items were easily understood and no specific concerns were raised regarding question structure, stems, or recall period (data not shown).

Across items in both SFs, ≥ 85% rated the individual questions as “somewhat” or “very relevant” to their fatigue experience (Table 3). In the Fatigue 7a, 25% rated one item “not at all” relevant (FATIMP21; “How often were you too tired to take a bath or shower?”). From 19 to 25% of participants rated three items that asked about being bothered, having to push oneself, and trouble finishing things due to fatigue as “not at all relevant” to their experience of fatigue.

Table 3 RA patient perceptions of relevance of PROMIS Fatigue short form items (N = 32)

To gain insight into fatigue attributions, we asked participants how they thought about their fatigue. Overall, 59% of participants rated their average fatigue over the past 7 days, while 19% rated their fatigue at its worst. The remainder were not sure how they thought about their fatigue (13%) or used other heuristics (9%). Most (72%) reported they considered how their fatigue interfered with day-to-day life; 19% considered the intensity/severity of the fatigue they were experiencing, and one person (3%) said they considered both impact and severity. The mean (SD) scores of participants who anchored their ratings on fatigue impact/interference were significantly lower than those anchoring on severity/intensity or both [46.8 (9.2), 57.4 (9.0), 60.7, respectively; p = .01].

Discussion

A growing body of evidence suggests that fatigue is a prevalent and debilitating symptom of RA that significantly impacts social and work participation, well-being, and quality of life. As definitions of fatigue vary, it remains unclear how to precisely and reliably measure this symptom in RA [23,24,25,26]. To our knowledge, this study is the first to assemble diverse samples and compare the responses of individuals recruited from an online community with those from specialty clinics to assess the relevance and representativeness of items in the two PROMIS Fatigue SFs currently available. Individual interviews were conducted in a separate sample to gain greater insight into how patients conceptualize and report on their experience of fatigue. Our results suggest that the contents of both the PROMIS Fatigue 7a and 8a SFs are relevant and representative of the full range of fatigue that people with RA experience. Most (87%) participants indicated that each PROMIS Fatigue SF asked about relevant aspects of fatigue. A similar proportion (88%) indicated that either version can capture the full spectrum of fatigue associated with RA. In one-on-one interviews, ≥ 75% judged the items as “very” or “somewhat” relevant to their experience of fatigue.

Most participants indicated they could distinguish fatigue they attributed primarily to their RA from a general tiredness resulting from other causes such as interrupted sleep or tending to young children. Among those with more active RA, fatigue was worse, and participants were more likely to attribute their fatigue directly to their RA. The attribution of symptoms such as fatigue is influenced by sociodemographic and psychological factors and disease knowledge [27], and often differs between patients and providers [28, 29]. In turn, symptom attributions influence coping, medication concerns, adherence to treatment, treatment response, and reporting of side effects to treating physicians [29, 30]. RA patients often attribute symptoms such as fatigue to less serious and non-modifiable causes, especially in the absence of joint swelling, and in turn are less likely to seek medical attention [27, 31]. In a recent study in the Netherlands of an open-label transition to a biosimilar, one quarter of patients who voluntarily switched asked to return to the originator, mainly due to subjective experiences, including fatigue, that they attributed to the new drug [32].

When assessing their fatigue, most participants indicated that they conceptualized it in terms of its impact rather than its severity, suggesting that the interference in day-to-day life, more than the symptomatic experience, may be most salient and disabling. Interestingly, those who rated impact rather than severity also had significantly lower mean fatigue scores. The two Fatigue SFs that are currently available contain non-overlapping items querying both severity and impact to produce a single score [13, 33].

Although fatigue has been recommended as a core outcome measure in RA trials [34], there has been little consensus on how to measure it. Commonly used measures that conceptualize fatigue as a unidimensional factor include the fatigue severity VAS, the four SF-36 vitality items [35], and the 13-item Functional Assessment of Chronic Illness Therapy (FACIT) Fatigue Scale [36]. Cella et al. compared the psychometric properties of the SF-36 Vitality subscale and FACIT in > 600 RA patients and found that the FACIT was better able to discriminate across the range of fatigue [37]. When comparing the fatigue VAS with longer scales in nearly 8000 people with RA, Wolfe concluded that the VAS showed similar or better responsiveness than longer scales [38]. While unidimensional scales are generally brief and easy to complete, a single-item fatigue VAS is less reliable than multi-item measures [26] and may be too general. Although concerns have been raised about conceptualizing fatigue and energy as a single dimension, as in the SF-36 items, a recent report using sophisticated bi-factor modeling supports a factor structure of one general (vitality) and two group (energy and fatigue) factors [23]. Examples of scales that assess fatigue as multiple domains in RA include the Multidimensional Assessment of Fatigue (MAF) scale, the Bristol Rheumatoid Arthritis Fatigue Multidimensional Questionnaire (BRAF-MDQ) [39], and the BRAF-NRS, which consists of 3 single-item scales that assess severity, impact, and coping ability. BRAF is a multidimensional RA-specific fatigue measure that was developed from patient focus groups and thus has strong content validity [39]. Using item response theory analysis, Oude Voshaar and colleagues [25] compared the psychometric characteristics of the BRAF-NRS, the SF-36, and the BRAF-MDQ and concluded that while all measured a common underlying domain of fatigue severity, they differed considerably in precision and targeting. Whereas the SF-36 items provided optimal information in individuals with mild fatigue, the BRAF-MDQ offered precision among those with high levels of fatigue [25]. The three single-item BRAF-NRS scales were not recommended due to the restricted measurement range. In the general population, the PROMIS Fatigue SFs offer maximum information in scores from 45 to 75, reflecting none to very mild feelings of tiredness (−0.5 SD) to very severe and sustained exhaustion (+2.5 SD) [40]. This range is highly relevant to people with RA. Tables that link scores on PROMIS Fatigue with FACIT-Fatigue and SF-36 Vitality are available at prosettastone.org.

The PROMIS family of measures was developed to reliably and precisely assess a broad range of health domains that directly impact quality of life across chronic conditions [10]. We previously reported evidence of construct validity and relevance of several CATs assessing symptoms and impacts that people with RA have identified as important to them [15]. Our results from the survey and cognitive debriefing interviews suggest that the Fatigue SFs contain items that are easy to understand and relevant to people with RA. Scores were similar on both SFs, and the primary difference between them appears to be length. PROMIS developers suggest that the 8a is more precise, whereas the 7a optimizes measurement across the full range of the domain [40]. Our results support the use of both versions in people with RA. Notably, the Fatigue 7a was recently used to evaluate overall symptom burden and quality of life in patients with myelofibrosis treated with ruxolitinib [41], with favorable results about fatigue included on the US product label. Rigorous methodology and a best practices approach [42] have been used to translate the PROMIS Fatigue SFs into more than 20 languages, with additional efforts underway [43].

Strengths of this study include the use of qualitative and quantitative methods to examine the content validity of the SF items in a diverse sample of patients with a wide range of disease symptoms and levels of fatigue. Purposive sampling was used to ensure good representation across age, sex, disease duration, disability, education, and geographical residence. There are also limitations. We were unable to confirm the diagnosis of RA in online participants, but our screening approach increases confidence that these individuals had inflammatory arthritis that required DMARD therapy. Only participant perceptions of disease activity were available for the online group. Most participants were female, white, and well educated. Individuals who agreed to participate may have fatigue experiences that differ from those of other RA patient populations. We did not specifically ask if patients attributed RA-related fatigue to their disease and/or to the medications used to control inflammation and pain.

In summary, in a socio-demographically and geographically diverse sample of people with RA from across the United States, fatigue was a common and important concern that affected day-to-day function and quality of life. Our results suggest that PROMIS Fatigue SFs are relevant and can measure across the continuum of fatigue experienced by people with RA. PROMIS Fatigue SFs generate a single summative score that can be easily interpreted and widely applied in clinical and research settings. These data contribute to growing evidence supporting the use of PROMIS measures to reliably and precisely evaluate fatigue and other symptoms in people with RA in clinical trials and care settings.