Introduction

The National Institutes of Health (NIH) initiated the patient-reported outcomes measurement information system® (PROMIS®) project in 2004 with the goal of providing researchers and clinicians access to standardized, psychometrically robust, patient-reported measures (pediatric and adult) of key symptom and function domains. A central principle is that these measures could be useful across a broad range of conditions and diseases to capture the impact of the disease and treatment on the lives of the patients. The non-disease-specific measures of the PROMIS measurement system allow the comparison of scores from one group of individuals to another to evaluate relative disease burden. Given the broad scope of the PROMIS measures, it is critical to evaluate the psychometric properties in multiple disease populations to provide evidence that the measures are valid and reliable assessments of the symptom and function domains they measure.

The PROMIS Pediatric measures are designed for children and adolescents between 8 and 17 years of age. Extensive qualitative and quantitative methods have been used to design and evaluate, in multiple diverse patient groups, the PROMIS measures of: physical function—mobility [1], physical function—upper extremity [1], pain interference [2], fatigue [3], depressive symptoms [4], anxiety [4], anger [5], and peer relationships [6]. In addition, a multi-site study collected data from 1447 children and adolescents with chronic health conditions, including sickle cell disease, kidney disease, cancer, rehabilitative needs, obesity, and rheumatic disease, in order to validate the PROMIS Pediatric measures in cross-sectional studies [7,8,9,10,11].

It is also critical to evaluate the responsiveness of the PROMIS Pediatric measures over time and changing clinical situations. Responsiveness is an aspect of validity that indicates the ability of a PRO measure to detect change over time when it is expected. For PROMIS Pediatric measures to be adopted for use in clinical trials, there must be evidence of their responsiveness to inform the efficacy evaluation of the intervention or the disease treatment impact under study. The goal of this study was to evaluate the responsiveness of eight PROMIS Pediatric measures in diverse samples of children and adolescents with cancer, sickle cell disease (SCD), or nephrotic syndrome (NS).

Methods

Participants and study design

The University of North Carolina (UNC) served as the central coordinating center to support the following disease-specific sites: Children’s National Health System (cancer), University of Michigan (nephrotic syndrome), and Emory University (sickle cell disease). The study reported here is a secondary analysis of three separate studies of the responsiveness of the PROMIS Pediatric measures, one at each site. Each site was free to select the number and timing of assessment points, the sample size, and the PROMIS domains; the resulting heterogeneity across the three diseases reflects differences in symptom burden and functional status by disease. However, all three studies shared the goal of assessing the responsiveness over time of the PROMIS Pediatric measures in children experiencing changing health status.

Common eligibility criteria across all diseases were age between 8 and 17 years, ability to read and speak English (because at the time of the study, there were no translations of the PROMIS Pediatric measures), functional computer skills (defined as the ability to see and interact with a computer screen, keyboard, and mouse), and willingness to give written assent/permission for study participation. Excluded were children and adolescents who had any concurrent medical or psychiatric condition that precluded study participation, or cognitive or other (e.g., visual) impairments that interfered with completing a self-administered, computer-based questionnaire. Additional eligibility criteria specific to a disease are provided below. All sites received approval from their respective Institutional Review Boards.

Cancer

Eligible children and adolescents were diagnosed with a childhood cancer, scheduled to receive a course of anti-neoplastic (not biologic agent only) chemotherapy from course 2 forward, and not currently enrolled in a Phase 1 clinical trial. The study included three time points. Time 1 (T1) occurred 1–2 days before an early course of chemotherapy. Time 2 (T2) occurred 7–16 days following chemotherapy initiation, at the time when the patient’s nadir was projected. Time 3 (T3) occurred within 1–2 days preceding the next course of scheduled chemotherapy or approximately 2 weeks following T2. More study details are provided elsewhere [12].

Based on clinical experience, we hypothesized that PROMIS physical symptoms and function scores at T1 would be within the normal range; however, children would have elevated emotional distress (depression and anxiety) given the proximity of the assessment relative to the cancer diagnosis and start-up of chemotherapy. Following chemotherapy and at the projected time of the nadir (T2), we hypothesized physical symptom and function scores would be worse compared to T1 (except depression and anxiety), and that the symptom and function scores would be improved at T3 compared to T2. The child’s relationship with peers was not expected to change due to the relatively short treatment cycles.

Nephrotic syndrome

Eligible children and adolescents included those with active nephrotic syndrome, defined as the presence of nephrotic range proteinuria (≥2+ on urinalysis and edema, or a urine protein/creatinine ratio >2 g/g) at the time of the first PROMIS assessment. Participants were recruited from 14 academic medical centers across the US and Canada.

Children completed questionnaires at three time points. For the purposes of this study’s analyses, the NS study had no baseline (T1) assessment comparable to those for the children with cancer or sickle cell disease. Children with NS were first assessed during active disease (T2); T3 was conducted when they reached complete remission, or 3 months after T2 if remission did not occur, and a subsequent follow-up (T4) took place 12 months after T2. More study details are provided elsewhere [13].

We hypothesized that symptoms would be worst and functional status poorest at T2, when the children were experiencing NS activity, and that symptom levels and functioning would be better at the T3 and T4 follow-up assessments, when the children were in remission. Peer relationships scores were not expected to change with health status.

Sickle cell disease

A convenience sample of SCD patients was recruited during routine clinic visits at three clinical sites that were part of the same large SCD program (Children’s Healthcare of Atlanta). Eligibility criteria included one or more acute care visits for pain in the previous year. More details on the study are provided elsewhere [14, 15].

Children completed questionnaires at up to four time points. T1 provided a baseline assessment of the child’s health status. The T2 assessment occurred at the end of a subsequent hospitalization for a pain exacerbation, 16.6 ± 19.1 months from the baseline visit. The T3 pain recovery assessment occurred at a median interval of 20 days (range 7–67 days) from the hospitalization assessment. T4 occurred at a subsequent routine clinic visit 1.5 ± 0.56 years following T1. Not all children enrolled in the study provided T2 and T3 data, because only 45% of the children experienced a pain exacerbation that led to a hospitalization. Sample sizes for T2 and T3 are therefore lower than for T1 and T4.

We hypothesized that children with SCD would have the worst symptom levels and poorest functioning at T2, due to the pain exacerbation, compared with T1, T3, and T4. We expected the greatest change to occur between T2 and T3 (recovery phase), with pain interference showing the biggest change relative to other symptoms. Because of the short duration of the pain episode, peer relationships scores were not expected to change.

Measures

The vast majority of participating children completed computerized-adaptive testing (CAT) versions of the PROMIS Pediatric measures of pain interference, mobility, upper extremity, fatigue, depressive symptoms, anxiety, anger, and peer relationships. The CAT and other data were collected using the Assessment Center platform (https://www.assessmentcenter.net). CAT tailors the questionnaire for the participant by selecting appropriately informative questions based on the individual’s responses to previously completed questions [16]. The result is a reliable assessment with minimal response burden. If the participants did not have access to the web, then the PROMIS pediatric 8-item short form for each health domain was used. PROMIS pediatric measures are scored on a T-score metric with mean of 50 and standard deviation of 10 in the original PROMIS pediatric item bank calibration [17]. Higher scores for symptom measures (i.e., fatigue, pain interference, depressive symptoms, anxiety, anger) indicate worse symptom experiences, whereas higher scores for function measures (i.e., mobility, upper extremity) and peer relationships indicate better functioning or relationships, respectively. In a previous study involving children diagnosed with one of five chronic diseases including cancer, SCD, and NS, a 3-point change was determined to be a minimally important difference (MID) [18]. The MID is defined as “the smallest difference in scores of a PRO measure that is perceived by patients as beneficial or harmful, and which would lead the clinician to consider a change in treatment” [19]. The MID was used to identify meaningful change (responsiveness) scores, as determined by patients, parents, and clinicians, beyond findings from a statistical analysis.
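The scoring conventions above can be illustrated with a short sketch. It is not part of the study's analysis code. The function names (`to_t_score`, `classify_change`) are hypothetical, and treating a change of exactly 3 points as meaningful is an assumption, since the text does not say whether the threshold is inclusive.

```python
# Domains as named in the text; which direction is "worse" follows the
# scoring description (higher symptom scores are worse, higher function
# and peer relationships scores are better).
SYMPTOM_DOMAINS = {"fatigue", "pain interference", "depressive symptoms",
                   "anxiety", "anger"}
FUNCTION_DOMAINS = {"mobility", "upper extremity", "peer relationships"}

MID = 3.0  # minimally important difference in T-score points [18]


def to_t_score(z):
    """Map a standardized score (mean 0, SD 1) to the T-score metric (mean 50, SD 10)."""
    return 50.0 + 10.0 * z


def classify_change(domain, before, after, mid=MID):
    """Label a T-score change as improved, worsened, or not meaningful.

    Assumption: a change is meaningful when its absolute size is at least
    `mid`; the source does not state whether the bound is inclusive.
    """
    domain = domain.lower()
    if domain not in SYMPTOM_DOMAINS | FUNCTION_DOMAINS:
        raise ValueError(f"unknown domain: {domain}")
    delta = after - before
    if abs(delta) < mid:
        return "no meaningful change"
    higher_is_better = domain in FUNCTION_DOMAINS
    return "improved" if (delta > 0) == higher_is_better else "worsened"


if __name__ == "__main__":
    # Hypothetical scores, not study data.
    print(to_t_score(1.0))                           # 60.0
    print(classify_change("fatigue", 58.0, 52.0))    # improved
    print(classify_change("mobility", 45.0, 47.0))   # no meaningful change
```

Keeping the direction of each domain in one place avoids sign mistakes when symptom and function measures are summarized together.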

Self-reported demographic data were also collected for each sample. For the analyses reported here, only the demographic data captured in common across the three sites were included. These variables include: gender (female or male), race (white vs. non-white), ethnicity (Hispanic or not), age (continuous), number of comorbid conditions (none vs. one or more conditions), and maternal education (high school or less vs. some college or more). The presence of other health conditions and highest level of maternal education were reported by the parent/guardian of the participating patient and captured on study case report forms.

Analyses

Descriptive statistics were computed for all study variables by disease and by time. The time between measurement occasions varied among the three disease types. However, the study design for each disease anchored the assessment points to disease events that aligned across studies, even though the intervals between these assessment points differed with disease course. Linear mixed models were fit using the lme4 R package [20] for the analysis of longitudinal measures. Each PROMIS outcome (e.g., fatigue, mobility) was modeled independently. Models were adjusted for time, gender, age, race/ethnicity, maternal education, comorbid conditions, and disease type. Additionally, subject-specific random intercepts controlled for heterogeneity between participants. Goodness-of-fit criteria were used to facilitate model selection and included AIC [21], BIC [22], and R² [23]. R² was calculated using the r2glmm R package [24]. Of three possible structures for the time variable (categorical, continuous, or both), semi-partial R² statistics [23] indicated that the categorical formulation explained more variability than either of the other approaches. Missing data were assumed to be missing completely at random (MCAR) based on sensitivity analyses and previous work [15].
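One way to write the random-intercept model described above is sketched below. The notation is ours and is an assumption, not taken from the study. Y_ij is the PROMIS T-score for participant i at occasion j, x_i collects the adjustment covariates (gender, age, race/ethnicity, maternal education, comorbid conditions, and disease type), and u_i is the subject-specific random intercept. Time enters as categorical indicators with T2 as the reference, as in the Results. The normal distributions are the usual linear mixed model assumptions.

```latex
Y_{ij} = \beta_0
       + \sum_{t \in \{1,3,4\}} \beta_t \,\mathbf{1}[\mathrm{time}_{ij} = t]
       + \mathbf{x}_i^{\top}\boldsymbol{\gamma}
       + u_i + \varepsilon_{ij},
\qquad u_i \sim N(0, \sigma_u^2), \quad \varepsilon_{ij} \sim N(0, \sigma^2)
```

Under this parameterization, each β_t is the adjusted mean difference in T-score between time point t and T2.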

With the alignment of events (time points) standardized across the three diseases, T2 was anticipated to be the point of worst symptom experience and functional impact. For the primary analyses of responsiveness, lower symptom scores and higher functional scores were anticipated at T3 (recovery phase) compared with T2 (event). For the secondary analyses, we compared T1 (baseline) and T4 (follow-up) with T2 (event).

Results

The study included 96 children with cancer, 121 children with SCD, and 127 children with NS. Gender was approximately equally distributed in the cancer and SCD groups, but males comprised 65% of the NS group. Mean age was similar across the three disease groups, but race differed. All children with SCD were black or African American except one, who was of mixed race; children with nephrotic syndrome were 28% black, 13% Asian, 8% Hispanic, and 8% other race; and children with cancer were 24% black, 20% Hispanic, and 13% other race. Maternal education was lower in the SCD group than in the other two groups (Table 1).

Table 1 Characteristics of children and adolescents with cancer, nephrotic syndrome, or sickle cell disease

Evaluation of responsiveness of the PROMIS pediatric measures

Figures 1, 2 and 3 show the average PROMIS Pediatric scores at T1–T4 for the SCD, cancer, and NS studies. In Fig. 1, peak average symptom scores are at T2, the event, as expected for the Pain Interference and Fatigue measures. In Fig. 2, average emotional distress scores (Anxiety, Depressive Symptoms, and Anger) are higher at T2 than at T3 or T4, and higher at T2 than T1 for the SCD study. For the cancer study, emotional distress is at the highest level at T1. Figure 3 shows the pattern of the functioning scores across T1–T4 for the three studies; for the Physical Functioning and Peer Relationships scores, the trends are more subtle, but there is a tendency for the lowest average functioning scores and relationship score to be at T2, as expected for the function domains.

Fig. 1 Upper panel: average PROMIS pediatric pain interference scores at T1–T4 for the sickle cell disease, cancer, and nephrotic syndrome studies. The vertical bars around each average are twice the standard error of the mean, giving approximately 95% confidence intervals. Lower panel: as above, for PROMIS pediatric fatigue scores

Fig. 2 Upper panel: average PROMIS pediatric anxiety scores at T1–T4 for the sickle cell disease, cancer, and nephrotic syndrome studies. The vertical bars around each average are twice the standard error of the mean, giving approximately 95% confidence intervals. Center panel: as above, for PROMIS pediatric depressive symptoms scores. Lower panel: as above, for PROMIS pediatric anger scores

Fig. 3 Upper panel: average PROMIS pediatric physical functioning: mobility scores at T1–T4 for the sickle cell disease, cancer, and nephrotic syndrome studies. The vertical bars around each average are twice the standard error of the mean, giving approximately 95% confidence intervals. Center panel: as above, for PROMIS pediatric physical functioning: upper extremity/dexterity scores. Lower panel: as above, for PROMIS pediatric peer relationships scores
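The error bars described in the figure captions (twice the standard error of the mean on either side of each average) can be computed as in the minimal sketch below. The function name and the example scores are hypothetical and are not study data.

```python
import math
import statistics


def mean_with_error_bars(scores):
    """Return (mean, lower, upper), with bars at mean +/- 2 standard errors."""
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores to estimate a standard error")
    mean = statistics.fmean(scores)
    # Standard error of the mean: sample standard deviation / sqrt(n).
    se = statistics.stdev(scores) / math.sqrt(n)
    return mean, mean - 2 * se, mean + 2 * se


if __name__ == "__main__":
    # Hypothetical T-scores for one disease group at one time point.
    mean, lower, upper = mean_with_error_bars([50, 54, 46, 52, 48])
    print(f"mean = {mean:.1f}, bars = ({lower:.2f}, {upper:.2f})")
```

Using a multiplier of 2 rather than 1.96 is the approximation the captions refer to.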

Table 2 provides the mixed model results for each of the PROMIS Pediatric symptom domains of Fatigue, Pain Interference, Anger, Anxiety, and Depressive Symptoms. The reference time point was set at T2 for all diseases, as this was the point when the child or adolescent was expected to have the worst health status relative to other time points due to chemotherapy (cancer), active disease (nephrotic syndrome), or pain exacerbation (sickle cell disease). Thus, negative regression weights (b) for T1, T3, and T4 in Table 2 indicate that symptoms were less severe than at T2. From T1 (baseline) to T2 (event), Fatigue (b = −3.1, p < 0.01) and Pain Interference (b = −2.6, p < 0.01) scores significantly worsened. From T2 (event) to T3 (recovery), all symptoms improved: Fatigue (b = −6.4, p < 0.001), Pain Interference (b = −5.5, p < 0.001), Anger (b = −3.3, p < 0.001), Anxiety (b = −4.3, p < 0.001), and Depressive Symptoms (b = −3.7, p < 0.001). All symptom mean change scores from T2 to T3 exceeded the MID of 3 points. From T2 (event) to T4 (follow-up), all symptoms except Anger improved (p < 0.01).

Table 2 Mixed model results for PROMIS pediatric symptoms
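The sign convention for the regression weights can be illustrated with a simplified sketch. It assumes a model containing only time indicators, with no covariates and no random intercepts, in which each weight equals the difference between that time point's mean and the T2 mean. The scores are hypothetical and the function name is ours. The study's own estimates come from the adjusted mixed models, not from this calculation.

```python
from statistics import fmean

# Hypothetical fatigue T-scores by time point (not study data).
scores = {
    "T1": [52.0, 55.0, 49.0],
    "T2": [58.0, 61.0, 55.0],
    "T3": [51.0, 53.0, 49.0],
    "T4": [50.0, 54.0, 52.0],
}


def contrasts_vs_reference(scores_by_time, reference="T2"):
    """Mean difference of each time point from the reference time point.

    In a regression containing only time indicators, with `reference` as
    the omitted category, these differences equal the fitted weights.
    """
    ref_mean = fmean(scores_by_time[reference])
    return {
        time: fmean(values) - ref_mean
        for time, values in scores_by_time.items()
        if time != reference
    }


b = contrasts_vs_reference(scores)
for time, value in b.items():
    # A negative weight on a symptom measure means lower (better) scores
    # at that time point than at T2.
    print(f"{time} vs T2: b = {value:+.1f}")
```

For function measures the interpretation flips: a positive weight indicates better functioning than at T2.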

Table 3 provides the mixed model results for the PROMIS Pediatric function domains of Physical Function: Mobility and Physical Function: Upper Extremity, and for Peer Relationships. From T1 (baseline) to T2 (event), both function domains and peer relationships decreased on average (represented by positive regression weights in PROMIS T-score units) (p < 0.05). From T2 (event) to T3 (recovery), the scores improved: Mobility (b = 3.7, p < 0.001), Upper Extremity (b = 3.1, p < 0.01), and Peer Relationships (b = 1.6, p < 0.05). From T2 to T4 (follow-up), the scores also improved (p < 0.05). The physical function domains exceeded the MID of 3 points going from T2 to T3 and from T2 to T4.

Table 3 Mixed model results for PROMIS Pediatric function domains
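As a quick check, the T2 to T3 regression weights quoted in the text can be compared against the 3-point MID. The sketch below uses only the values reported above. The variable names are ours, and the strict inequality reflects the text's wording that the estimates exceeded the MID.

```python
MID = 3.0  # minimally important difference in T-score points [18]

# T2 -> T3 regression weights as quoted in the text for Tables 2 and 3.
reported_b_t3 = {
    "Fatigue": -6.4,
    "Pain Interference": -5.5,
    "Anger": -3.3,
    "Anxiety": -4.3,
    "Depressive Symptoms": -3.7,
    "Mobility": 3.7,
    "Upper Extremity": 3.1,
    "Peer Relationships": 1.6,
}

# Compare magnitudes, because improvement is negative for symptom
# measures and positive for function measures.
exceeds_mid = {domain: abs(b) > MID for domain, b in reported_b_t3.items()}
below_mid = [domain for domain, flag in exceeds_mid.items() if not flag]

print(below_mid)  # ['Peer Relationships']
```

Only Peer Relationships falls below the threshold, which matches the statement that its change was statistically significant but smaller than the MID.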

Discussion

This multi-site, multi-disease longitudinal study provided evidence for the responsiveness of the PROMIS Pediatric measures of fatigue, pain interference, anxiety, anger, depressive symptoms, mobility, and upper extremity. Using mixed modeling methods, we were able to combine data from children and adolescents with cancer, nephrotic syndrome, and sickle cell disease and examine how PROMIS scores in a more stable health state compared with scores when the child was experiencing a deteriorating health event: chemotherapy (cancer), disease activity (nephrotic syndrome), or pain exacerbation (sickle cell disease). The magnitude of change from the event to the recovery phase exceeded the minimally important difference of 3 points [18] for all domains expected to change. These findings are consistent with our hypotheses about expected changes in symptoms and function in the three groups. Contrary to our hypotheses, peer relationships scores worsened significantly from T1 to T2 and improved significantly from T2 to T3; however, the magnitude of change was below the MID, and peer relationships was the least affected domain. It may be that the combined decline in physical functioning and increase in symptom burden at T2 slightly lowered children’s reported peer relationships.

The findings from this study add to the validity evidence from other studies that have examined the responsiveness of the PROMIS Pediatric measures. In an online cohort of 276 children with Crohn’s disease (ages 9–17 years), children completed self-report measures at baseline and 6 months later, including a measure of Crohn’s disease activity and PROMIS measures of pain interference, fatigue, anxiety, depressive symptoms, and peer relationships [25]. Children with improved Crohn’s disease activity from baseline to follow-up reported improved scores (larger than the established MID) on all PROMIS pediatric measures, and children with worse Crohn’s disease activity from baseline to follow-up reported worse scores (larger than the MID) for all domains except Anxiety. In another study, 229 children with asthma (ages 8–17 years) enrolled in public insurance programs completed PROMIS pediatric measures of pain interference, fatigue, depressive symptoms, mobility, and peer relationships, along with measures of asthma control, at four time points over 2 years [26]. The study found that children with worsened asthma control and poorer overall health tended to report deteriorated function and more symptom burden on the PROMIS pediatric measures, with fatigue showing the greatest change.

The ability of the PROMIS pediatric fatigue measure to capture the largest changes compared to the other symptom measures is especially meaningful in children. Fatigue has been reported as the most troubling symptom during and following the recovery period by children experiencing a number of chronic conditions, including cancer, anemia, and surviving organ transplants. This means that the ability of the PROMIS pediatric fatigue measure to capture change is highly relevant to a number of illness groups.

This study had limitations. The first is limited locality: the cancer data were collected from a single site and the SCD data from three sites in the Atlanta area, which may raise concerns about the generalizability of the findings. However, we have no reason to believe that children with these conditions in these localities would respond to the questionnaire items differently from children elsewhere. Second, the study was conducted in English only. Lastly, the assessments did not record other life events or stressors that could have affected the PROMIS results beyond the influence of the disease under study.

Conclusions

The PROMIS Pediatric measures, as completed by 8- to 17-year-olds experiencing one or more chronic conditions, are able to measure symptom and functional impact for the affected children and adolescents at specified time points and to capture clinically meaningful change in health conditions as hypothesized. This means that these measures are able to quantify the impact of disease and treatment on a child or adolescent and, further, that these measures are now ready to be embedded into clinical trials for treatment of these diverse chronic illnesses.

The responsiveness of the PROMIS Pediatric measures has been documented here for three different pediatric chronic conditions and in the literature for two additional pediatric chronic conditions that vary in their clinical presentation and in their likely causative factors [25, 26]. In addition, there are ongoing efforts to further evaluate the responsiveness of the PROMIS pediatric measures in additional disease populations. The NIH-funded Validation of Pediatric Patient-Reported Outcomes in Chronic Diseases (PEPR) Consortium [http://grants.nih.gov/grants/guide/rfa-files/RFA-AR-15-014.html] will examine how changes in PROMIS scores are associated with changes in disease status in populations of children with inflammatory bowel disease, cancer, juvenile idiopathic disease, systemic lupus erythematosus, and asthma. These studies will include the PROMIS measures used in this study as well as newer PROMIS pediatric measures.