Introduction

Given the prevalence of fatigue symptoms among parents in the weeks and months after birth, there is a need for brief, reliable, and valid measures of postpartum fatigue to facilitate accurate diagnosis and guide treatments. Currently, many fatigue scales are in use, but few have been validated in postpartum samples (e.g., Giallo et al. 2014). The Fatigue Severity Scale (FSS; Krupp et al. 1989) is a widely used scale for measuring fatigue-related impairment in clinical populations, with demonstrated reliability, validity, and sensitivity to treatment (Whitehead 2009), but it has yet to be validated in the postpartum period.

The aims of this study were to (1) examine the psychometric properties of the FSS among women admitted to a residential early parenting program for management of unsettled infant behaviors (UIB) and (2) identify potential improvements to the psychometric properties of the FSS. The sample of women seeking help with UIB (e.g., persistent crying, resistance to soothing, and sleep difficulties) used in this study is well suited to assessing the FSS in a help-seeking postpartum population, because women in this situation commonly experience fatigue, depression, and anxiety symptoms (Fisher et al. 2002, 2011).

Method

Participants and procedure

Participants were recruited from the Masada Private Hospital Mother Baby Unit (MPHMBU) in Melbourne, Australia, on admission to a non-psychiatric residential early parenting program for management of UIB. There were no exclusion criteria. Ethics approval was obtained from the Avenue Hospital Research Ethics Committee and Monash University Human Research Ethics Committee.

Measures

The FSS is a 9-item scale of fatigue severity and interference with functioning that uses response options ranging from 1 (“Strongly Disagree”) to 7 (“Strongly Agree”). FSS total scores range from 9 to 63, with scores ≥ 36 suggesting clinically elevated fatigue. The Fatigue Visual Analog Scale (FVAS; Insana and Montgomery-Downs 2010) is a single-item measure of fatigue. The question “How fatigued do you feel right now?” is rated between 0 (“Not at all”) and 100 (“Very Fatigued”) using a 10-cm line. The Depression Anxiety Stress Scale–depression subscale (DASS21-D; Lovibond and Lovibond 1995) and Edinburgh Postnatal Depression Scale (EPDS; Cox et al. 1987) are widely used measures of depressive symptoms. Pre-admission EPDS scores were obtained from consenting participants’ medical records (n = 141). Alpha was 0.88 for the DASS21-D and 0.81 for the EPDS. Instrumental social support was assessed with the following item: “How often do you feel that you need practical support or help but can’t get it from anyone?” (0 = “Never”; 3 = “Very Often”).
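The FSS scoring rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not code used in the study; the function and constant names are hypothetical, and it assumes a complete set of nine integer responses with no missing items.

```python
from typing import Sequence

FSS_ITEM_COUNT = 9   # number of items in the original FSS
FSS_CUTOFF = 36      # totals at or above this suggest clinically elevated fatigue


def fss_total(responses: Sequence[int]) -> int:
    """Sum the nine FSS item responses (each an integer from 1 to 7)."""
    if len(responses) != FSS_ITEM_COUNT:
        raise ValueError("expected exactly 9 item responses")
    if any(r not in range(1, 8) for r in responses):
        raise ValueError("each response must be an integer from 1 to 7")
    return sum(responses)


def is_clinically_elevated(responses: Sequence[int]) -> bool:
    """Apply the >= 36 cutoff to the FSS total score."""
    return fss_total(responses) >= FSS_CUTOFF
```

For example, a respondent who answers 4 on every item has a total of 36 and is classed as elevated, whereas lowering any single answer to 3 brings the total to 35, below the cutoff.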

Analysis

Rasch analysis (RA) was used to assess the unidimensionality, validity, and reliability of the FSS. The RA was conducted in Winsteps 3.92.1 (Linacre 2016) using a partial credit model. Published RA criteria for acceptable psychometric properties for fatigue scales were used (Lerdal et al. 2016; Mills et al. 2009; Ottonello et al. 2016; see Table 1). Convergent validity was assessed using correlations between the Rasch-generated person scores and the FVAS. Discriminant validity was assessed using correlations between the Rasch scores and the DASS21-D, EPDS, and social support, all measures of constructs distinct from fatigue.
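For readers less familiar with Rasch models, the partial credit model is usually written in the form below. This is the standard textbook statement rather than anything taken from the analysis output of this study, and the symbols are notation assumed here: \theta_n is the fatigue level ("ability") of person n, \delta_{ik} is the k-th step threshold ("difficulty") of item i, and m_i is the highest response category of item i, with categories indexed from 0.

```latex
P(X_{ni} = x) =
  \frac{\exp\left( \sum_{k=0}^{x} (\theta_n - \delta_{ik}) \right)}
       {\sum_{h=0}^{m_i} \exp\left( \sum_{k=0}^{h} (\theta_n - \delta_{ik}) \right)},
\qquad x = 0, 1, \ldots, m_i,
\qquad \text{with } \sum_{k=0}^{0} (\theta_n - \delta_{ik}) \equiv 0 .
```

Because person and item parameters sit on the same logit scale, a mean person measure well above the mean item measure indicates that the items are too "easy" to endorse for the sample, which is how a ceiling effect appears in this framework.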

Table 1 Rasch analysis criteria and results for FSS and FSS-5R

Results

In total, 44% of women (N = 167) admitted to the MPHMBU between 1 June 2015 and 12 October 2015 participated on the day of admission. On average, participants were 34.26 years old (SD = 4.23) and infants were 8.51 months old (SD = 4.16, range = 2 to 23.50). Seventy percent of participants were born in Australia, 87% spoke English at home, 99% were married or living with a partner, 77% were university educated, and 35% reported past treatment for a psychiatric disorder. Missing data for FSS items were minimal (0.27%), and the mean total FSS score was high at 47.92 (SD = 8.85, range = 21 to 63), with 89% of women scoring above the clinical cutoff. Mean scores were also elevated on the FVAS (M = 73.78, SD = 16.45, n = 129), DASS21-D (M = 5.12, SD = 3.81, n = 162), and EPDS (M = 10.87, SD = 4.66, n = 141); 50% of mothers reported at least mild depressive symptoms (DASS21-D ≥ 5).

The results of the RA are presented in Tables 1 and 2. Overall, the FSS items were not well targeted, and scale items did not assess the latent variable of fatigue at either extreme (see Table 2). The mean Rasch-scored person fatigue “ability” was substantially higher (> 1 logit) than the mean Rasch-scored item “difficulty” of the FSS, showing an overall ceiling effect for the scale (see Table 1: step 7). Item response categories did not advance monotonically for items 3, 6, 7, and 9 (see Table 1), and fit was poor for items 2 and 7 (Infit > 1.3 MeanSquare [MnSq]; see Table 2). More participants (9%) had poor fit based on person goodness-of-fit statistics than expected by chance (≤ 5%). Reliability was adequate, and there was no differential item functioning for infant or maternal age.

Table 2 Rasch item fit statistics for FSS and FSS-5R

To improve the scale, the following procedures were carried out based on past FSS validations (Mills et al. 2009; Ottonello et al. 2016). The response options of the FSS were first revised from 1234567 to 1112345 (i.e., the three lowest options were collapsed into a single category), and then items with ongoing lack of monotonic advancement (i.e., the increase of fatigue levels did not correspond to the increase in response options; item 9) or inadequate fit (Infit MnSq > 1.3; items 1, 2, 3) were removed in a stepwise fashion. The revised scale (FSS-5R) retained items 4 to 8 of the original FSS and performed better against the RA criteria (see Table 1), with adequate validity and reliability (α = 0.88), although it still assessed only a modest range of fatigue difficulty (see Table 2). The correlation between the FSS-5R scores and the FVAS was 0.43 (p < 0.001). The correlations between the FSS-5R and the DASS21-D, EPDS, and social support were 0.34, 0.29, and 0.19, respectively (all p < 0.05). For comparison, the correlation between the DASS21-D and EPDS was 0.69 (p < 0.001).
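The rescoring described above (collapsing response options from 1234567 to 1112345 and retaining items 4 to 8) can be illustrated with a short Python sketch. It is not the authors' code: the function and constant names are hypothetical, and it assumes responses are supplied in original item order with no missing values.

```python
from typing import Dict, List, Sequence

# Collapse the original seven response options (1234567) into five (1112345).
RECODE: Dict[int, int] = {1: 1, 2: 1, 3: 1, 4: 2, 5: 3, 6: 4, 7: 5}

# Items of the original FSS retained in the FSS-5R (1-based item numbers).
RETAINED_ITEMS = (4, 5, 6, 7, 8)


def rescore_fss5r(responses: Sequence[int]) -> List[int]:
    """Return recoded responses for items 4 to 8 of a completed 9-item FSS."""
    if len(responses) != 9:
        raise ValueError("expected exactly 9 item responses")
    if any(r not in RECODE for r in responses):
        raise ValueError("each response must be an integer from 1 to 7")
    # Item numbers are 1-based, Python indices are 0-based.
    return [RECODE[responses[item - 1]] for item in RETAINED_ITEMS]
```

For instance, original responses of 1, 2, 3, 4, and 5 on items 4 to 8 become 1, 1, 1, 2, and 3. The analyses reported here used Rasch-generated person scores, so a simple sum of the recoded items would be only a rough stand-in for those measures.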

Discussion

This is the first study to evaluate the psychometric properties of the FSS during the postpartum period. The original FSS had several limitations among women with elevated fatigue seeking help for UIB, including an overall ceiling effect and poor fit for several items. It did, however, demonstrate adequate reliability and a lack of item bias for different maternal and infant age categories. After revision of the response options and removal of poorly performing items, the revised 5-item FSS (FSS-5R) had improved fit, a reduced ceiling effect, and adequate validity and reliability.

Similar problems with the FSS have been found in populations with chronic illnesses (e.g., multiple sclerosis), where improvements in psychometric properties occurred following revision of response options and removal of items with poor fit (Mills et al. 2009; Ottonello et al. 2016). The modified FSS-5R retains items 4 to 8 of the original FSS and measures the interference of fatigue with specific domains of functioning. The removed items related to motivation (item 1), exercise (item 2), ease of fatigability (item 3), and interference of fatigue with work, family, and social life (item 9). Women in our study may have had difficulty simultaneously evaluating fatigue interference across the three different domains specified in item 9, which may be affected to different extents by their fatigue.

The FSS-5R demonstrated discriminant validity, showing only small-to-moderate positive associations with measures of depressive symptoms and a small association with social support needs. These correlations with the FSS-5R were weaker than the strong correlation observed between the two depressive symptom scales and the moderate correlation between the FSS-5R and FVAS. However, evidence for convergent validity of the FSS-5R and FVAS was modest. The lack of a stronger association may be due to the scales having different focuses (interference vs. symptoms), different assessment periods, and different item lengths.

The recruitment rate and sample size were modest but sufficient for item and person measures to be adequately stable. The study sample was predominantly married/partnered, well educated, and Australian born, which may limit generalizability to populations with different demographics. Further research is also required to assess FSS-5R for its fit among other postpartum populations (e.g., women not seeking support for UIB or fatigue, partners who are also vulnerable to fatigue), the longitudinal stability of its item hierarchy, and its sensitivity to treatment. Strengths of this study were use of a rigorous Rasch approach and recruitment of a help-seeking group of women, both firsts in a validation study of a postpartum fatigue scale. Findings from our sample are also likely to generalize to many other women in clinical settings seeking help with fatigue or other forms of psychological distress, given that as many as one in four families report difficulties with infant crying, fussing, and sleep difficulties (Fisher et al. 2011).

Overall, findings suggest that the original FSS has numerous psychometric problems in women with elevated postpartum fatigue symptoms. Clinicians and researchers measuring fatigue in postpartum populations should be cautious about using the FSS in its original form and are advised to consider using the briefer FSS-5R, which has more robust psychometric properties and can be easily rescored from the original FSS.