Introduction

Primary biliary cirrhosis (PBC) is a chronic cholestatic liver disease of unknown etiology. Bile duct injury from portal and periportal inflammation often results in progressive fibrosis and eventual cirrhosis [1]. Apart from the complications of progressive liver disease, one of the most disabling symptoms in PBC is fatigue. Fatigue is often described as a perception of exhaustion that reduces physical and mental capacity. It frequently affects the performance of daily activities and is a significant contributor to impaired health-related quality of life in these patients.

Natural history studies over the past 25 years have reported the presence of fatigue in 0–12% of patients with PBC [1–7]. In contrast, a significantly higher frequency (44–76%) has been reported from the large, randomized controlled trials of ursodeoxycholic acid designed to halt disease progression [8–12]. Referral bias and the systematic assessment of fatigue in these trials are likely responsible for this discrepancy in prevalence. For asymptomatic patients at diagnosis, the cumulative risk for developing fatigue over a 5- to 10-year period is substantial, at 44–56% [13].

With the recognition of fatigue as a major determinant of poor health status in PBC, a number of recent investigations have used multi-item validated questionnaires to improve fatigue detection. Prevalence rates between 60% and 80% among patients with predominantly early-stage disease have been reported [14–17]. These investigations also indicate that fatigue is independent of hepatic disease severity, sleep disturbance, and depression [14–16].

Identifying effective medical treatment specific to fatigue in PBC has received little attention until recently. The empirical use of antioxidant therapy had no effect on fatigue scores in a randomized, placebo-controlled crossover trial [18], despite positive results from an uncontrolled study [19]. In another crossover study, ondansetron (a 5-HT3 receptor antagonist) did not improve fatigue severity compared with placebo [20].

In keeping with the hypothesis of abnormal serotonin neurotransmission, we sought to determine the safety and efficacy of fluoxetine (a selective serotonin reuptake inhibitor; SSRI) for the treatment of fatigue in patients with PBC. The choice of fluoxetine was based on the existence of long-term safety data and documented positive treatment effects for other fatigue-related disorders.

Methods

Patient population

Individuals with PBC between 18 and 75 years of age were eligible for study enrollment. In addition, a verbal report of fatigue of at least six months' duration was required. Diagnostic criteria for PBC included the following: (1) cholestatic serum liver biochemistry abnormalities for ≥6 months, with a serum alkaline phosphatase level ≥1.5 times the upper limit of normal; (2) a serum antimitochondrial antibody titer ≥1:40 or 1.0 by immunofluorescence measurement; (3) the absence of biliary obstruction by cross-sectional imaging; and (4) a previous liver biopsy diagnostic of or compatible with PBC. The ability to provide written informed consent was also considered an inclusion criterion.

Exclusion criteria included the following: (1) a known medical condition or metabolic disorder sufficient to explain fatigue; (2) current clinical depression or a history of clinical depression; (3) treatment with fluoxetine hydrochloride, other SSRI agents, tricyclic antidepressants, monoamine oxidase inhibitors, or benzodiazepine agents within 3 months prior to study enrollment; (4) a known hypersensitivity to fluoxetine hydrochloride; (5) the anticipated need for liver transplantation referral from decompensated liver disease; (6) pregnancy or current lactation; and (7) the inability or unwillingness to practice contraceptive measures for the prevention of pregnancy. The study was approved by the Mayo Foundation Institutional Review Board and conducted according to principles contained within the Declaration of Helsinki.

Study design

Randomization

Sequence generation was performed from a centralized pharmacy location by a clinical trials pharmacist using a randomization schedule in blocks of four patients. Individuals were randomized to placebo or fluoxetine tablets at 20 mg to be taken every morning for an 8-week duration. The placebo tablets were similar in color and design to facilitate blinding. Individuals were counseled to take the medication prior to meals. The allocation sequence and treatment assignment were concealed by a clinical trials pharmacist. The codes were kept in the pharmacy until the study’s conclusion. Enrollment of study participants was performed by experienced nurse study coordinators (R.A.J., J.C.P.). The clinical trials pharmacist, study coordinators, clinicians, and patients were unaware of treatment assignment during the study.
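
As an illustration of permuted-block randomization with a block size of four, the following minimal Python sketch builds a schedule in which every complete block contains two assignments to each arm. The function name, the seed parameter, and the use of Python's random module are assumptions made for illustration; the software actually used by the clinical trials pharmacy is not described in this report.

```python
import random


def block_randomize(n_patients, block_size=4, arms=("fluoxetine", "placebo"), seed=None):
    """Return a list of treatment assignments built from shuffled, balanced blocks.

    Hypothetical helper for illustration only; not the trial's actual procedure.
    """
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)          # local generator so the schedule is reproducible
    per_arm = block_size // len(arms)  # assignments per arm within each block
    schedule = []
    while len(schedule) < n_patients:
        block = list(arms) * per_arm   # balanced block, e.g. 2 fluoxetine + 2 placebo
        rng.shuffle(block)             # random order within the block
        schedule.extend(block)
    return schedule[:n_patients]       # a final partial block may be unbalanced


schedule = block_randomize(20, seed=2024)
```

Balance is guaranteed only at the end of each complete block. If enrollment stops partway through a block, the arms can differ in size.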

Interventions

A complete history and physical examination were performed on each patient prior to study entry. Symptoms of pruritus and keratoconjunctivitis sicca (dry eyes and/or dry mouth) were abstracted from histories, given their potential contribution to fatigue. Serum hepatic biochemistry tests including alkaline phosphatase, aspartate aminotransferase (AST), total bilirubin, albumin, and prothrombin time were performed. Patients were referred for cross-sectional imaging with abdominal ultrasound to exclude biliary obstruction and indirect features of portal hypertension (ascites, splenomegaly, intra-abdominal varices) if not performed within 1 year of study enrollment.

To exclude the presence of subclinical depression, the Center for Epidemiological Studies—Depression Scale (CES-D) [21] was administered to all eligible patients before randomization. This self-reporting, 20-item instrument has been validated as a measure for assessing depression in clinical research. A score of ≥16 is highly associated with clinical depression requiring formal psychiatric evaluation. This also represented a study exclusion criterion. Sensitivity for the CES-D in detecting subclinical depression among patients with PBC has been previously demonstrated [14].

Baseline fatigue severity was determined in all randomized patients with the Fisk Fatigue Impact Scale (FFIS) [22]. The FFIS is a 40-item questionnaire that employs a Likert scale, with each question rated from 0 to 4. A range of scores between 0 and 160 points is possible with the FFIS. Domains represented by the FFIS include physical, cognitive, and psychosocial. Reliability and construct validity for the FFIS have been established in cross-sectional studies of patients with PBC [14–17]. The average time to completion for the FFIS is between 5 and 10 min.
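
The total FFIS score described above is the sum of 40 item ratings. A minimal Python sketch follows, assuming the responses are supplied as a list of integers. The function name is hypothetical, and the assignment of items to the physical, cognitive, and psychosocial domains is not given here, so only the total score is computed.

```python
def ffis_total(responses):
    """Sum 40 FFIS item ratings (each 0-4) into a total score between 0 and 160.

    Hypothetical helper for illustration; domain subscores are not computed
    because the item-to-domain mapping is not stated in this report.
    """
    if len(responses) != 40:
        raise ValueError("the FFIS has exactly 40 items")
    if any(r not in (0, 1, 2, 3, 4) for r in responses):
        raise ValueError("each item must be rated 0, 1, 2, 3, or 4")
    return sum(responses)


lowest = ffis_total([0] * 40)   # no fatigue impact reported on any item
highest = ffis_total([4] * 40)  # maximal fatigue impact reported on every item
```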

Health-related quality-of-life (HRQL) assessment was performed with the Chronic Liver Disease Questionnaire (CLDQ) [23]. The CLDQ was developed as an evaluative instrument to measure longitudinal change in health status within individuals with chronic liver disease. A total of 29 items with specific response formats (Likert scale) ranging from worst (1) to best (7) function are included. Six domains are represented: (1) abdominal symptoms, (2) systemic symptoms, (3) activity, (4) emotional function, (5) fatigue, and (6) worry. Scoring of the CLDQ was performed by dividing each domain score by the number of items per domain. The average time to completion is 10 min. Reliability and construct validity for the CLDQ in patients with PBC have been demonstrated in cross-sectional investigations to date [23–25]. The use of disease-specific instruments for evaluating HRQL in clinical trials is recommended, as they may be more responsive to change than generic questionnaires.
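
The CLDQ domain scoring rule (the sum of a domain's item ratings divided by the number of items in that domain) can be sketched in Python as follows. The function name and the example responses are hypothetical. The number of items in each domain is not stated in this report, so the caller supplies the grouping.

```python
def cldq_domain_scores(responses_by_domain):
    """Return the mean item rating (1 = worst, 7 = best) for each CLDQ domain.

    `responses_by_domain` maps a domain name to the list of item ratings
    belonging to it; the grouping of items is an assumption of the caller.
    """
    scores = {}
    for domain, items in responses_by_domain.items():
        if not items:
            raise ValueError(f"no items supplied for domain {domain!r}")
        if any(not 1 <= rating <= 7 for rating in items):
            raise ValueError("CLDQ items are rated from 1 to 7")
        scores[domain] = sum(items) / len(items)  # domain sum / items per domain
    return scores


# Illustrative ratings only; not data from the trial.
example = cldq_domain_scores({"fatigue": [2, 3, 4, 3, 3], "worry": [6, 7, 5, 6, 6]})
```

Dividing by the number of items keeps every domain on the same 1–7 scale, so domains with different item counts can be compared directly.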

Patients were also monitored for potential adverse events related to therapy. Serum liver biochemistries were performed every 4 weeks through mailed samples or patient visits for the 8-week treatment period. For stable elevations in serum biochemical parameters (defined by a two- to threefold elevation in serum alkaline phosphatase, AST, or total bilirubin), repeated blood samples were to be obtained every 2 weeks throughout the 8-week treatment period. Pronounced worsening of serum liver biochemistries (defined as a fourfold elevation in any of the serum hepatic biochemistries) would result in discontinuation of therapy. Intractable pruritus and fatigue, nausea and/or vomiting, severe diarrhea, and the need for liver transplantation referral were also indications for treatment discontinuation. Patients were notified to contact study nurse coordinators with the development of any possible symptoms related to therapy.

Outcomes

The primary study outcome was defined as a ≥50% reduction in fatigue severity (quantified by the FFIS) following 8 weeks of treatment, compared to baseline values. Secondary endpoints included (1) the frequency of adverse events in each treatment arm; (2) change in serum alkaline phosphatase, AST, total bilirubin, and albumin levels after 8 weeks of therapy compared to baseline values; (3) change in health-related quality of life measured by the overall CLDQ score; (4) change in CLDQ fatigue domain score; and (5) change in CES-D score from end of treatment compared to baseline.
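
The primary endpoint reduces to a simple calculation on paired FFIS scores. The Python sketch below, with hypothetical function names and purely illustrative scores, classifies a patient as a responder when the end-of-treatment score is at least 50% lower than the baseline score.

```python
def percent_reduction(baseline, end_of_treatment):
    """Percentage reduction in FFIS score relative to baseline (positive = improvement)."""
    if baseline <= 0:
        raise ValueError("baseline FFIS score must be positive to compute a reduction")
    return 100.0 * (baseline - end_of_treatment) / baseline


def is_responder(baseline, end_of_treatment, threshold=50.0):
    """True when the reduction from baseline meets or exceeds the threshold (default 50%)."""
    return percent_reduction(baseline, end_of_treatment) >= threshold


# Illustrative individual scores only; not data from the trial.
met_endpoint = is_responder(60, 30)     # a 50% reduction meets the endpoint
missed_endpoint = is_responder(60, 45)  # a 25% reduction does not
```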

Power and sample size

For power and sample size estimates, it was assumed that 50% of individuals enrolled in the study receiving active treatment would achieve the primary endpoint. Among placebo-treated patients, it was estimated that 10% of individuals would achieve the primary endpoint. Based on a power of 80%, and a significance level of 0.05, a required enrollment of 36 patients (18 in fluoxetine arm, 18 in placebo arm) was estimated. Given a possible 20% dropout rate based on adverse events and other trial-related issues, a sample size of 40 patients was required.
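
For orientation, the assumptions above can be entered into a standard normal-approximation formula for comparing two independent proportions. The Python sketch below is not necessarily the method used for this trial, which is not stated. It uses the unpooled-variance form without continuity correction, and the function name is hypothetical. Under these assumptions it returns 17 patients per arm, close to but not identical to the 18 per arm reported. Pooled-variance, continuity-corrected, or exact methods give somewhat different values.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(p_treatment, p_control, alpha=0.05, power=0.80):
    """Approximate patients per arm for a two-sided comparison of two proportions.

    Unpooled normal approximation, no continuity correction (an assumed method).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    variance = p_treatment * (1 - p_treatment) + p_control * (1 - p_control)
    effect = p_treatment - p_control
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


n = n_per_arm(0.50, 0.10)  # 50% response on fluoxetine vs. 10% on placebo -> 17
```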

Statistical analyses

Continuous variables (reported as medians with interquartile ranges) were analyzed using the Wilcoxon signed rank test, given the nonparametric nature of the data. Chi-square or Fisher’s exact test was used to compare end-of-treatment rates with baseline values for categorical data. Associations between FFIS scores and relevant variables were reported using the Spearman correlation coefficient method. A P value of ≤0.05 was considered statistically significant. Interim analyses were not performed given the short duration of therapy. Missing data were excluded but all patients were accounted for in the primary endpoint analysis.
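
With small cell counts, Fisher's exact test is the natural choice for a 2×2 comparison of responder rates. The following Python sketch uses only the standard library and one common two-sided convention, which sums the probabilities of all tables no more likely than the observed one. The function name and the counts in the example are illustrative assumptions, not the trial's data or the statistical software used by the authors.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)   # smallest feasible top-left count
    highest = min(row1, col1)      # largest feasible top-left count
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)  # tolerance for floating-point ties
    )
    return min(1.0, p_value)


# Illustrative counts: 1 responder and 6 nonresponders in each of two arms.
p = fisher_exact_two_sided(1, 6, 1, 6)
```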

Results

Participant flow

Patients who were potentially eligible for this study were approached exclusively in the outpatient clinic setting (Fig. 1). The majority of these individuals (≥90%) were evaluated as part of ongoing clinical care for established disease or to confirm a diagnosis of PBC following the conduct of investigations elsewhere. Over a 30-month period, 220 patients with PBC were asked about the presence or absence of fatigue (i.e., “Have you been bothered by fatigue in the last six months?”). Utilizing this approach, 103 patients (47%) reported the presence of fatigue to their health-care providers. However, only 20 (9%) individuals agreed to study enrollment. Reasons for nonparticipation included mild symptom severity not warranting therapy, eligibility for other clinical trials in PBC, and lack of interest in fluoxetine as the study therapy. Two (10%) individuals were not randomized based on scores ≥16 points on the CES-D questionnaire. These patients were offered further evaluation and psychiatric consultation.

Fig. 1 Flow diagram of patient progress through phases of the randomized trial

Table 1 Baseline characteristics

The remaining 18 individuals were subsequently randomized and received treatment. Ten patients were randomized to fluoxetine, while eight patients were assigned to placebo treatment. Fourteen (78%) individuals completed therapy. Eleven (61%) patients completed the entire study protocol. All 14 patients who completed therapy were included in the analysis of the primary outcome.

Baseline variables

Baseline variables for both treatment arms are listed in Table 1. Demographics, clinical features of PBC, and baseline questionnaire scores (FFIS, CLDQ, CES-D) were similar between groups. Stage 1–2 histologic disease was observed in 14 of 18 (78%) patients. None of the patients with symptoms from keratoconjunctivitis sicca or pruritus described these as severe in grade.

Primary outcome

Median baseline FFIS scores for fluoxetine and placebo arms were 52 and 41, respectively. After 8 weeks of therapy, the median FFIS scores for fluoxetine and placebo arms were 51 and 28, respectively. No significant difference between end-of-treatment FFIS scores was observed (P=0.42). The reduction in median FFIS score (corresponding to improvement in symptoms) observed among placebo-treated patients (42 to 28) was not statistically significant (P > 0.05). The treatment arms did not differ significantly in the percentage of subjects who recorded improvement in FFIS scores while receiving treatment. Only one patient in each treatment group achieved a ≥50% reduction in FFIS score, the threshold defined for symptom improvement. No significant difference in median FFIS physical, cognitive, and psychosocial domain scores between treatment arms was observed (data not reported).

Secondary outcomes

End-of-treatment CLDQ scores were available for 11 (61%) of the 18 patients (5 in fluoxetine arm, 6 in placebo arm). The median overall CLDQ scores in fluoxetine- and placebo-treated patients were similar (5.04 vs. 5.41; P=0.65) and no different from baseline values. End-of-treatment CES-D scores were available for 9 (50%) of the 18 patients (5 in fluoxetine arm, 4 in placebo arm). The median CES-D scores in fluoxetine- and placebo-treated patients were similar (5 vs. 5.5; P=0.96) and no different from baseline values. No significant change in serum hepatic biochemistry levels was observed in either treatment arm during the 8-week treatment period (data not shown).

Adverse events

Worsening fatigue (n=3) was the most common adverse event reported. Other side effects reported following treatment initiation include dizziness (n=1), somnolence (n=1), nausea (n=1), and vomiting (n=1). Four patients withdrew from the study before completing the treatment period. Drug-related adverse events were the cause for study withdrawal in three patients (worsening fatigue, dizziness, nausea/vomiting). All of these individuals were ultimately found to be on fluoxetine at the conclusion of the study. Health insurance limitations were cited for one individual as the reason for study dropout.

Discussion

In this randomized, double-blind, placebo-controlled trial, the use of fluoxetine was not associated with improvement in fatigue severity among patients with PBC. Potential confounding factors including hepatic disease severity, frequency of pruritus and keratoconjunctivitis sicca, and subclinical depression did not appear to influence results given their balanced distributions following randomization. In addition, drug-related adverse events necessitated study withdrawal in three patients, all of whom were given fluoxetine during the treatment period. Only one individual in each treatment arm achieved a ≥50% reduction in fatigue severity as quantified by the FFIS. No change in HRQL was observed following treatment either.

The prevalence of fatigue in PBC has been inconsistently reported. This may be related to the insensitivity of questions about fatigue where responses are elicited in a dichotomous (yes/no) fashion. Improvement in fatigue detection and quantification of its severity has occurred with the use of validated multi-item questionnaires such as the FFIS. The FFIS is particularly useful in this regard, given its recognition of physical, mental, and psychosocial domains contributing to fatigue. While performance of the FFIS has been studied in cross-sectional settings, data on its ability for assessing treatment response remain limited. In their study of oral antioxidant therapy, Prince and colleagues failed to demonstrate any statistical difference in overall and domain-specific FFIS score between active and placebo-treated groups [18]. Similar observations were noted in this study.

Altered central neurotransmission is one of the leading hypotheses to explain the development of fatigue in patients with cholestasis including PBC. Both serotonergic and noradrenergic pathways have been implicated [26]. Defective central serotonin neural activity is also observed in nonhepatic disorders such as chronic fatigue syndrome [27]. In a rat model of cholestasis, the administration of serotonin receptor agonists, however, was associated with increased overall locomotor activity scores compared to bile duct-resected controls. In clinical settings, an isolated case report [28] has documented improvement in fatigue with ondansetron (a 5-HT3 receptor antagonist) in a single patient with chronic hepatitis C. In a multicenter double-blind, randomized crossover trial among individuals with PBC [20], the use of ondansetron was not associated with reduced fatigue severity assessed by the FFIS. Results of this investigation, however, may not be valid, given the absence of reported concealed allocation and the use of per-protocol rather than intent-to-treat analyses.

Several alternate theories have emerged to explain the development of fatigue in patients with PBC. One is that impaired corticotropin-releasing hormone (CRH) release or central activity may be responsible. Abnormal hypothalamic-pituitary-adrenal (HPA) axis function has been documented in patients with rheumatoid arthritis [29] and multiple sclerosis [30]. Swain and colleagues have demonstrated a decrease in hypothalamic CRH concentrations and diminished in vitro release of CRH from hypothalamic explants of rats with obstructive cholestasis [31]. Furthermore, HPA axis responsiveness is diminished following injection of CRH in cholestatic rats [32, 33]. Burak and colleagues also demonstrated increased sensitivity to locomotor activating effects following central CRH infusion [34]. Similar pathophysiologic processes are hypothesized to occur in PBC-associated fatigue, which could be further elucidated with HPA axis testing. A number of behavioral disturbances including fatigue have also been linked to elevated serum cytokine levels including interleukin (IL)-6 and IL-1β [26]. Infusion of IL-1β into cholestatic rodents increased lethargy and fatigue, suggesting that liver disease may enhance the central effects of this cytokine [35]. Similar investigations have not been reported in humans with PBC.

The conduct of this investigation raised a number of questions. The projected study enrollment was not achieved, owing to the inability to prospectively recognize a discordance between verbal report and FFIS quantification of fatigue severity in our population. A number of patients with fatigue also deferred study enrollment based on perceived minimal symptoms. A systematic examination of FFIS scores in asymptomatic patients and those with self-reported fatigue who refused participation is currently ongoing. At first glance, the absence of difference in fatigue severity between study arms may be ascribed to reduced power (type II error). However, the lower median FFIS score following placebo therapy appears to reinforce the limited efficacy of fluoxetine. The ability to detect small change may be limited when using the FFIS questionnaire to determine the benefits of a therapeutic intervention. Identifying what constitutes a minimal but important clinical difference for clinical trials in fatigue requires further study. Although no significant difference in end-of-treatment CES-D scores was observed between groups, a subclinical effect of fluoxetine on depression resulting in lower FFIS scores among placebo-treated patients may not have been completely excluded.

Future studies should continue to focus on neuroendocrine alterations including central serotonergic transmission and HPA axis dysregulation. Further refinement of fatigue assessment, with a focus on developing objective measures that correlate with self-perceived symptoms, is a major priority. Ultimately, a consensus definition of what constitutes a minimally important clinical difference following the use of novel therapies for the treatment of fatigue is required. Multicenter collaborations should also be pursued, given the likelihood that both geographic and cultural influences affect the reported prevalence and severity of fatigue.