Introduction

Chronic graft versus host disease (GVHD) is a well-documented, immune-mediated complication that occurs in 30–70 % of patients after hematopoietic cell transplantation (HCT) [1]. Chronic GVHD has traditionally been distinguished from acute GVHD based on its occurrence after day +100 post-HCT; however, persistent, recurrent, and late acute GVHD are subcategories of acute GVHD in which features of acute GVHD (maculopapular rash, gastrointestinal symptoms, transaminitis) occur beyond 100 days [2••]. Furthermore, the broad category of chronic GVHD includes both classic chronic GVHD and overlap syndrome, which occurs when features of both acute and classic chronic GVHD occur simultaneously [2••].

The incidence of chronic GVHD is increasing, likely due to older recipient age, expanded donor population, use of peripheral blood stem cells, and use of donor lymphocyte infusions [3]. Chronic GVHD is a leading cause of post-HCT mortality; it may necessitate treatment in an intensive care unit, which confers excess risk [4], and/or lead to prolonged illness and late death [5]. It has an even higher impact on morbidity, as it can lead to vision loss, end-stage lung disease, or severe infection secondary to prolonged immune suppression [2••]. Amongst allogeneic HCT recipients, patients with active chronic GVHD are at a significantly higher risk of life-threatening conditions versus those without chronic GVHD [6]. It can be inferred that with more comorbidities, these patients likely require more health care resources and utilize more health care dollars. As a result, chronic GVHD significantly compromises patients’ function and quality of life (QOL), a complex concept that encompasses physical, cognitive, emotional and social functioning, and well-being [3, 7].

The ultimate goal of HCT is cure of the underlying disease, with minimal to no detrimental impact on QOL, along with adequate hematopoietic function and immune reconstitution. Alloreactivity after HCT likely determines the complex interplay of GVHD, immune reconstitution, graft versus tumor effect, and graft function. Current clinical trial endpoints have focused on cumulative incidence of GVHD, non-relapse mortality, and overall survival but have not been able to successfully incorporate QOL into a composite endpoint (Fig. 1). In order to do so, we need to understand the impact of chronic GVHD on QOL. The focus of this article is to outline the studies that have addressed QOL after HCT with a specific focus on chronic GVHD and to address some of the future efforts that are needed in this field.

Fig. 1

Response Assessment in GVHD. The relationship between frequently used endpoints in chronic GVHD studies and their relationship with overall HCT success. GVT graft versus tumor

An association between acute and chronic GVHD and worse QOL was first noted in the late 1990s among several retrospective and cross-sectional studies [8–11]. The initiative for specifically evaluating and quantifying QOL changes related to chronic GVHD began with the development of the Lee Chronic GVHD Symptom Scale in 2002 [12]. In 2006, Lee et al. reported the first longitudinal study to demonstrate that patients with both acute and chronic GVHD report worse QOL after HCT and recommended that QOL be used as an important, measurable outcome [13]. The Bone Marrow Transplant Survivor Study found chronic GVHD to be the most important predictor of late effects and worse overall health in HCT survivors [14]. That study was the first to document that outcomes of HCT survivors with resolved chronic GVHD were comparable to those without chronic GVHD, demonstrating an even greater need for improved therapies to combat chronic GVHD [15]. Finally, a review article published in 2009 by Pidala et al. clearly demonstrated negative associations between both acute and chronic GVHD and QOL [7].

In 2005, to address the unmet need, the National Institutes of Health organized a consensus conference for improving outcome assessment in chronic GVHD by establishing standardized definitions for diagnosis, severity scoring, response measures, and the conduct of clinical trials of chronic GVHD [2••, 16–20, 21•]. In order to prospectively evaluate the proposed recommendations, the Chronic GVHD Consortium is conducting a multicenter prospective cohort study of patients with chronic GVHD. There are 11 participating centers in the Chronic GVHD Consortium (Fig. 2). Specific studies have been conducted with this cohort to evaluate the impact of chronic GVHD on patient-reported QOL and will be reviewed here [22–29].

Fig. 2

Participating Centers in the Chronic GVHD Consortium. Enrolling centers include Fred Hutchinson Cancer Research Center, Stanford University, University of Minnesota, Dana-Farber Cancer Institute, Vanderbilt University Medical Center, Medical College of Wisconsin, H. Lee Moffitt Cancer Center, Washington University, Memorial Sloan Kettering Cancer Center, Ann and Robert H. Lurie Children’s Hospital of Chicago

Chronic Graft Versus Host Disease Consortium Cohort

Patients were eligible for study participation if they were allogeneic HCT recipients age 2 years or older with a diagnosis of chronic GVHD (including overlap syndrome) and receiving systemic immunosuppressive therapy. Cases were defined as incident (study enrollment less than 3 months after chronic GVHD diagnosis) or prevalent (study enrollment three or more months after, but within 3 years of chronic GVHD diagnosis). Exclusion criteria included inability to comply with study procedures, primary disease relapse, or anticipated survival less than 6 months due to comorbid disease. Additional characteristics of this cohort have been previously described [18, 21•]. An important aspect of data collected within this cohort was the QOL assessments performed in conjunction with clinical data and standardized response criteria; patients were asked to report their symptoms, global severity scores, perception of disease activity and change, functional status, and quality of life using validated questionnaires (Table 1).
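The incident/prevalent case definition above can be illustrated with a short sketch. It is not the Consortium's actual enrollment logic: the function name, the day-based thresholds (91 days for "3 months," 1095 days for "3 years"), and the "ineligible" label for enrollment outside the stated window are assumptions made only for illustration.

```python
from datetime import date

# Assumed conversions: the cohort definition is stated in months and years,
# so these day counts are approximations chosen for this sketch.
INCIDENT_WINDOW_DAYS = 91        # "less than 3 months" after diagnosis
PREVALENT_LIMIT_DAYS = 3 * 365   # "within 3 years" of diagnosis


def classify_case(diagnosis: date, enrollment: date) -> str:
    """Label one patient as 'incident', 'prevalent', or 'ineligible'.

    'ineligible' is a hypothetical label for enrollment before diagnosis
    or more than 3 years after it; the source does not name this category.
    """
    days = (enrollment - diagnosis).days
    if days < 0 or days > PREVALENT_LIMIT_DAYS:
        return "ineligible"
    if days < INCIDENT_WINDOW_DAYS:
        return "incident"
    return "prevalent"
```

Under these assumptions, a patient enrolled 1 month after diagnosis would be labeled incident and one enrolled 1 year after diagnosis would be labeled prevalent.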

Table 1 Validated QOL instruments used in Chronic GVHD Consortium Studies

Impact of Chronic GVHD Severity on Quality of Life

Prior studies have shown the adverse impact of the presence of chronic GVHD on patients’ QOL [8–11, 13, 15, 36, 37]. However, by studying the Chronic GVHD Consortium cohort, Pidala et al. became the first to evaluate the impact of chronic GVHD severity, as defined by NIH criteria, on QOL [26]. Two hundred sixty of 298 patients (87 %) completed all or part of the FACT-BMT and the SF-36. In multivariate analysis, baseline GVHD severity at the time of study enrollment predicted levels of QOL scores (both composite and subscale). Age was also noted to be associated with QOL, specifically SF-36 physical functioning, which likely represents the notion that increasing age impacts physical abilities. Although there were few statistically significant differences in QOL scores between patients with mild versus moderate GVHD severity, there were significant differences observed between patients with moderate versus severe GVHD. QOL composite and subscale scores were lower on average for patients with severe GVHD when compared to patients with moderate GVHD; this was most evident on the SF-36 role-physical, FACT total, and FACT-BMT total scores.

Pidala et al. then compared PCS and MCS scores of patients with chronic GVHD to both population normative data as well as scores of patients with a variety of other chronic medical conditions. When compared to age- and gender-matched US population normative data, patients with chronic GVHD, regardless of severity, had significantly lower QOL scores for physical functioning, role-physical, bodily pain, general health, vitality, social functioning, and PCS. Patients with mild or moderate chronic GVHD had MCS scores comparable to both population normative and chronic medical condition data, indicating more preserved mental health; however, patients with severe chronic GVHD had MCS scores indicative of depression. Furthermore, patients with moderate chronic GVHD had mean PCS scores similar to patients with multiple sclerosis and diabetes, and those with severe chronic GVHD had mean PCS scores similar to individuals suffering from systemic lupus erythematosus and myocardial infarction. These results indicate that chronic GVHD severity is significantly associated with QOL, independent of other demographic, disease, and transplant-related factors. This relationship was present among multiple QOL domains, demonstrating a broad spectrum of impairment by chronic GVHD severity.

Inamoto et al. collected similar data to evaluate whether not just the presence of chronic GVHD, but changes over time in the NIH-proposed objective response measures were associated with symptom burden and QOL [28]. Patients completed the SF-36, FACT-BMT, HAP, Lee Chronic GVHD Symptom Scale, and a 10-point scale for peak symptom severity during the previous week [38]. Clinical responses were calculated utilizing the provisional response algorithm [18] as complete response, partial response, stable disease, or progressive disease for both individual organ systems and overall, at enrollment and at the 6-month follow-up visit. Of the 283 patients included, 150 (53 %) were incident cases and 133 (47 %) were prevalent cases. Surprisingly, there was no association found between overall response and QOL scores in either incident or prevalent cases. Clinical response, both overall and for individual organ systems, at 6 months correlated with patient-reported symptom burden for incident cases, but not for prevalent cases. The authors hypothesize that this effect may be due to either that symptoms are more easily treated early after chronic GVHD diagnosis or that symptom changes are less noticeable in patients who have had a prolonged chronic GVHD course. Another interesting finding in the current study was that type of systemic GVHD treatment was not associated with changes in symptom or QOL scores at the time of enrollment but was at the 6-month follow-up visit. Use of prednisone at 6 months was associated with higher symptom burden and lower QOL. Furthermore, daily prednisone compared to less frequent dosing at 6 months was associated with higher symptom burden on the Lee Chronic GVHD Symptom Scale (p = 0.039) and 10-point overall symptom scale (p = 0.022), lower QOL on the SF-36 PCS (p = 0.0017) and FACT-BMT (p = 0.0091), and worse HAP maximum activity score (p = 0.005). Use of calcineurin inhibitors at 6 months, though, was not associated with symptom or QOL scores.

In contrast, a previous study by Pidala et al. evaluating the Chronic GVHD Consortium cohort reported poor correlation between changes in NIH-proposed chronic GVHD severity scores and QOL measures [29]. Global severity scores, according to the NIH Consensus criteria and scoring algorithm, were collected for 336 patients, as well as an independent assessment of severity by both patients and clinicians using “none,” “mild,” “moderate,” and “severe” without specific definitions, and scores were compared to those from the prior visit. Although this study was limited by a substantial proportion of missing data for QOL assessments, it still brings to light notable findings. In multivariate analyses, using chronic GVHD severity change as either a continuous or categorical variable, no association was found with changes in QOL as assessed by the SF-36 and FACT-BMT. However, there were significant associations noted between clinician-reported changes in chronic GVHD severity and change in QOL on the FACT-G (p = 0.002) and FACT-BMT (p = 0.004), especially as GVHD severity decreased, but not on the FACT-TOI or SF-36. There were also significant associations found between patient-reported changes in chronic GVHD severity and change in QOL on all assessments (p < 0.001 for SF-36 PCS, SF-36 MCS, FACT-TOI, FACT-G; p = 0.000 for FACT-BMT). These data show that QOL information must be ascertained directly; it cannot be inferred from clinician-reported scores or chronic GVHD severity ratings.

Pidala et al. found baseline chronic GVHD severity level at the time of study enrollment to be significantly associated with multiple QOL domains, independent of other demographic, disease, and transplant-related factors [26]. However, they also found that changes in chronic GVHD severity, when evaluated at follow-up visits and compared to the prior visit, were not associated with significant changes in patient-reported QOL [29]. As previously suggested, patients’ QOL is affected by many factors, such as effects from the underlying disease, toxicities from prior therapies, and permanent deficits from chronic GVHD [28]. Therefore, whether QOL changes notably when chronic GVHD resolves remains controversial [7]. Longitudinal assessments are needed to evaluate how QOL is affected as GVHD status changes and to determine whether patients experience a prolonged impairment of QOL despite clinical improvement.

Impact of Chronic GVHD Subtypes on Quality of Life

There are data to support that overlap syndrome is associated with worse prognosis and inferior outcomes when compared to classic chronic GVHD [39]. Pidala et al. found that patients with overlap syndrome have worse functional impairment and some degree of lower QOL [24]. The study evaluated 427 patients, 352 (82 %) with overlap syndrome and 75 (18 %) with classic chronic GVHD. Those with overlap syndrome were more likely to be incident cases, with a shorter time from HCT to enrollment. They had significantly higher degrees of functional impairment, as measured by poorer performance on the 2-min walk test (distance in feet 495 versus 540; p < 0.001) and lower HAP scores (for example, maximum activity score 70 versus 78; p < 0.001) when compared to those with classic chronic GVHD. Patient-reported symptom burden was also higher amongst those with overlap syndrome versus those with classic chronic GVHD. Patients with overlap syndrome reported worse social functioning on the SF-36 (median score 40.5 versus 45.9; p = 0.01), although other QOL aspects were similar between the two groups.

Of note, patients in the current study with overlap syndrome also had lower overall survival and higher non-relapse mortality rates. Because the incidence rates of prior acute GVHD were similar between the two groups, Pidala et al. suspected that this functional impairment is related more to the chronic GVHD component of disease than to the acute GVHD manifestations or the prolonged immune suppression required for its treatment. However, Inamoto et al. reported that frequent use of prednisone was associated with worse QOL, symptom, and activity scores [28]. One may infer that more frequent prednisone dosing is required for higher disease severity, so it is likely that the disease is negatively impacting QOL. However, chronic GVHD is a prolonged illness, which necessitates systemic immune suppression for a median time of 2 to 3 years before tolerance occurs [40]. The side effects from higher steroid doses may actually be causing increased symptoms beyond physicians’ perceptions.

Impact of Site-Specific Chronic GVHD on Quality of Life

Gastrointestinal Chronic GVHD

Regarding gastrointestinal (GI) involvement by chronic GVHD, the NIH criteria grade severity on a scale of 0–3 by degree of weight loss and magnitude of elevation of lab values for GI and hepatic manifestations, respectively [2••]. Pidala et al. examined whether site of GI and/or type of hepatic involvement is associated with overall survival, nonrelapse mortality, symptoms, QOL, and functional status in 567 patients [23]. Site of GI involvement was divided into none, esophageal, upper GI, and lower GI, as well as by whether it occurred alone or in combination, and type of hepatic involvement was classified as none or as elevation of bilirubin, alkaline phosphatase, and/or alanine aminotransferase (ALT) over the upper limit of normal. The authors found a relationship between clinician-reported site of GI involvement, but not type of hepatic involvement, and patient-reported symptom burden, as reported on the Lee Chronic GVHD Symptom Scale. Overall GI severity score and elevated bilirubin were associated with patient-reported QOL and can likely be used as markers in clinical practice to improve or maintain QOL in affected patients; however, the distinction between upper and lower GI involvement, liver severity score, and other hepatic measures (alkaline phosphatase, ALT) were not consistently associated with QOL. Further studies are needed to definitively determine which objective measures and assessments result in significant changes in QOL so that affected patients may be recognized and treated earlier.

Joint and Fascia Chronic GVHD

Features of chronic GVHD joint and fascia involvement include edema, joint stiffness or restricted range of motion (ROM), contractures, and rarely, arthralgia and arthritis [2••]. Though they occur infrequently, the features can be significant and likely impact physical fitness and contribute to lower QOL. Inamoto et al. evaluated three joint assessment scales as well as 10 symptom, QOL, and physical function scales to determine the optimal means of identifying changes in joint and fascia manifestations of chronic GVHD in 567 patients followed for a mean duration of 23.6 months [22]. Joint and fascia manifestations were present at the time of study enrollment in 164 (29 %) of patients. Those with joint and fascia manifestations had a higher symptom burden and lower QOL as indicated by lower scores on the FACT-G (median score 76 versus 81; p = 0.003) and the SF-36 PCS (median score 37 versus 40; p = 0.002) versus those without. However, the authors concluded that neither the SF-36 nor the FACT-G are completely adequate for capturing QOL changes associated with joint and fascia manifestations of chronic GVHD, as the SF-36 was sensitive only to clinical improvement, and the FACT-G was sensitive only to clinical worsening. Patients with joint and fascia manifestations also had more frequent skin involvement and skin sclerosis, as well as a higher NIH global severity score, which may also contribute to inferior QOL.

Impact of Exercise Tolerance and Muscle Strength on Quality of Life

Measures of exercise tolerance and voluntary muscle strength have been used in several clinical settings to diagnose functional impairment, monitor changes in ability over time and/or with therapeutic interventions, and gauge prognosis. However, there is little information regarding the utility of the 2-min walk test (2MWT) in post-HCT patients [41] and no prior data for hand grip strength (HGS) in this population. Pidala et al. studied the relationship of the 2MWT and HGS, in 584 patients of the Chronic GVHD Consortium cohort, with chronic GVHD severity and response, overall mortality, and patient-reported measures [25]. Significant associations were found between shorter 2MWT and higher symptom burden (Lee Chronic GVHD Symptom Scale for overall, skin, lung, and energy categories), more impaired QOL (SF-36 PCS, physical functioning, role functioning-physical, general health, and vitality, as well as FACT scores), and functional disability (HAP scores). Similarly, though to a lesser extent, lower HGS was associated with more impaired QOL (SF-36 physical component summary score, general health, and FACT-BMT scores) and functional disability (HAP adjusted activity score). The authors postulate that the impaired performance of patients with chronic GVHD is likely due to a combination of decreased cardio-pulmonary fitness, poorer function of chronic GVHD target organs, and effects of immunosuppression (muscle weakness, atrophy, and/or dependent edema). Regardless of cause, functional impairment, as measured by the 2MWT and HGS, negatively impacts patients’ QOL.

Impact of Age on Quality of Life

El-Jawahri et al. were the first to evaluate differences in QOL, symptom burden, and functional ability between patients with chronic GVHD in different age groups [27]. Five hundred twenty-two patients were divided into three age groups at the time of enrollment: adolescent and young adult (AYA; 18–40 years), middle-aged (41–59 years), and older (≥60 years). The AYA group contained 115 patients (22 %), the middle-aged group had 279 patients (53 %), and the older group had 128 patients (25 %); all patients had either moderate (58 %) or severe (42 %) chronic GVHD. Of note, the older age group was more likely to have had reduced intensity conditioning (RIC), peripheral blood as the graft source, and a higher comorbidity burden. Overall symptom burden, as measured by the Lee Chronic GVHD Symptom Scale, was comparable among all age groups, but subscale analysis revealed that older patients experienced a lower psychological symptom burden than AYA and middle-aged patients (median score for older 16.7, middle-aged 25.0, AYA 25.0; p = 0.001), indicating that they cope well with their limitations and preserve a reasonable QOL. Also, older patients demonstrated more preserved QOL when compared to middle-aged and AYA patients, as measured by the FACT-BMT (median score for older 109, middle-aged 102, AYA 106; p = 0.01), despite having greater physical limitations and more functional impairment, as measured by the HAP and 2MWT. After adjusting for demographic, disease, and transplant-related factors, a U-shaped relationship between age and QOL was found; older and AYA patients had similar FACT-BMT scores, while middle-aged patients scored approximately 5.7 points lower than both groups. SF-36 PCS and MCS were similar across all age ranges. While AYA patients had fewer physical limitations, middle-aged patients had similar limitations to older patients but still reported lower QOL scores.

These findings are consistent with a recent publication evaluating QOL after allogeneic HCT, which found older patients to have similar overall QOL and higher social well-being scores when compared to younger patients [42]. It has also been documented that even when older patients experience chronic GVHD and report symptoms such as fatigue, dyspnea, insomnia, and appetite loss, they still rate their global QOL as good-to-excellent [43]. Therefore, age does not seem to have an independent effect on QOL in patients with moderate and severe chronic GVHD. Older patients seem to cope well with their resulting limitations and maintain an acceptable QOL, supporting the notion that advanced age should not be a barrier to consideration of HCT. Additionally, middle-aged patients may require additional counseling and education to ensure that their expectations of potential adverse effects associated with HCT are realistic.

Conclusions

The ongoing work of the Chronic GVHD Consortium in the area of chronic GVHD and QOL has provided the field with valuable information (Table 2), but there is still a significant amount of work to be done. Unfortunately, studies evaluating chronic GVHD are often fraught with limitations beyond small sample size.

Table 2 Summary of Chronic GVHD Consortium QOL studies

Evaluation of QOL using patient-reported outcomes (PROs) brings to light additional limitations that must be addressed in future studies. For example, GVHD-specific QOL metrics are needed. The Lee Chronic GVHD Symptom Scale was specifically developed for this purpose, but the other questionnaires frequently used in chronic GVHD trials, including the ones discussed here, were not. Another drawback of using PROs in data collection is missing data (such as incomplete patient-reported surveys). With small sample sizes, it is imperative that all data be collected completely to maximize generalizability of results. Inamoto et al. noted that their study was limited by a substantial portion of missing data on QOL measures, especially as time from study enrollment increased [28]. As several questionnaires were utilized in each of the studies discussed here, the data burden on patients and providers is substantial. When patients experience survey fatigue, the completeness and reliability of the questionnaires are diminished. Inamoto et al. determined that the SF-36 and FACT-BMT questionnaires were fairly good indicators of patient perspectives and physicians’ evaluation, although neither correlated well with changes in NIH severity scores [28]. Furthermore, based on their findings, they also postulate that the forms could be condensed to only the FACT-G to decrease redundancy of questions and reduce the time commitment of paperwork without losing valuable data [28]. Additional studies comparing QOL surveys and application of their results to other chronic GVHD measures will facilitate more efficient data collection. Deciphering which questions best discriminate between the presence or absence of chronic GVHD and compiling those into a single, concise, reliable patient-reported QOL tool is necessary.

Another aspect in need of additional attention is the discrepancy between changes in clinical assessments and changes in patient-reported QOL. As reported by Inamoto et al. for joint and fascia manifestations, clinician and patient-perceived clinical changes do not always correlate with a change in reported QOL [22]. For other chronic GVHD manifestations, however, clinical assessment and objective laboratory data are associated with patient-reported symptom burden and QOL [23]. Therefore, a better understanding of which clinical changes affect PROs and other clinical endpoints is essential for improving targeted therapies in chronic GVHD.

The conclusions drawn thus far usher in additional questions to be answered and areas of impact to be explored. Longitudinal assessments such as the studies discussed here by the chronic GVHD Consortium are necessary to increase our knowledge on the long-term effects of chronic GVHD, duration of impairment, and predictors of recovery/worsening. The impact of chronic GVHD on QOL needs to be measured, and the tools must be able to discriminate from other coexisting problems that impact QOL but are not related to chronic GVHD. Wood et al. recently introduced using the concept of survival without progressive impairment as an endpoint for chronic GVHD clinical trials [44]. These endpoints need to be validated in independent cohorts before they are deemed acceptable.

In order to make meaningful progress in chronic GVHD management, we need targeted therapeutic agents that are approved by the Food and Drug Administration. In order to achieve that, the transplant community needs to systematically evaluate various QOL endpoints and identify interventions that can fulfill patients’ ultimate goal post-HCT—to live longer and live better.