Introduction

Duchenne muscular dystrophy (DMD) is an X-linked progressive disorder that affects males [1]. DMD occurs in 1 out of every 3,500–6,300 live male births and is caused by an absence of dystrophin, an essential protein found at the inner surface of muscle fibers [1, 2]. Symptoms of DMD include progressive and generalized muscle weakness, which leads to impaired physical functioning [3]. In early childhood, boys with DMD may fall frequently and have difficulty climbing stairs. Boys with DMD often become dependent on a wheelchair for mobility at around 10 years of age [4–6]. These symptoms are detected in early childhood and cause various functional difficulties that affect overall quality of life.

Quality of life (QoL) is defined by the World Health Organization (WHO) as “an individual’s perception of their position in life in the context of the culture and value systems in which they live and in relation to their goals, expectations, standards and concerns” [7]. As a distinct component of QoL, health-related quality of life (HRQoL) focuses on the impact of disease and treatment on disability and daily functioning [8]. As the survival rates of life-threatening conditions, such as DMD, have increased, the focus of medical services has shifted from evaluating treatment outcomes alone to also evaluating HRQoL [9].

The HRQoL of children with disabilities can be measured through both child self-reports and parent proxy-reports. Parent proxy-reports can supplement child self-reports of HRQoL [10], especially when a child is too young or sick to complete the assessment. Research indicates that parents can assume the values and preferences of their child in parent proxy-reports [11]. Furthermore, research widely supports the idea that child self-reports and parent proxy-reports provide complementary information when measuring the HRQoL of the child [12–14]. However, many studies have reported discrepancies between these reports [9], as parents of children with disabilities tend to report lower HRQoL for their children than the children do for themselves [15].

A lack of well-established statistical methods may be one of the reasons for inconsistencies between child self-reports and parent proxy-reports [12]. The Pearson correlation coefficient (Pearson r) together with a t test is frequently used to assess agreement between raters (i.e., children and parents) [6, 12, 16]. However, these methods do not express agreement within a single index. Pearson r tests the consistency of agreement between raters; a significant correlation indicates that the rank orders between raters are consistent, but it does not mean that the scores are the same. In contrast, a t test examines the magnitude of agreement using the mean difference between raters. Therefore, the results of the two methods could conflict with each other. In other words, child self-reports and parent proxy-reports could show a good correlation even with a statistically significant mean difference (p < 0.05). Conversely, the two reports could show a poor correlation even though the mean difference is not statistically significant (p > 0.05) [17]. The intraclass correlation coefficient (ICC) can examine both the consistency and the magnitude of agreement between raters by assessing overall variability based on individual differences [12, 14, 17]. However, the ICC is based on scale-score level agreement and does not provide information about agreement at the item-level. A weighted kappa could assess agreement on each response category of an item, but it can only be applied to a categorical scale [17].
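The possible conflict between the two statistics can be illustrated with a small numerical sketch. The scores below are invented for illustration and are not study data; the helper names `pearson_r` and `paired_t` are hypothetical, and 2.365 is the two-sided 5 % critical value of the t distribution with 7 degrees of freedom.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def paired_t(x, y):
    """Paired t statistic (df = n - 1) for the mean of the differences x - y."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    mean_d = sum(d) / n
    sd_d = math.sqrt(sum((v - mean_d) ** 2 for v in d) / (n - 1))
    return mean_d / (sd_d / math.sqrt(n))

T_CRIT_DF7 = 2.365  # two-sided alpha = 0.05, df = 7

# Case 1 (illustrative): parents rate about 10 points lower, same rank order.
child_1 = [90.0, 80.0, 70.0, 60.0, 50.0, 85.0, 75.0, 65.0]
parent_1 = [79.0, 71.0, 59.0, 51.0, 39.0, 76.0, 64.0, 56.0]

# Case 2 (illustrative): identical means, but the rank order is scrambled.
child_2 = [50.0, 60.0, 70.0, 80.0, 90.0, 55.0, 65.0, 75.0]
parent_2 = [75.0, 55.0, 90.0, 50.0, 65.0, 80.0, 60.0, 70.0]

for label, c, p in [("case 1", child_1, parent_1), ("case 2", child_2, parent_2)]:
    r, t = pearson_r(c, p), paired_t(c, p)
    print(label, "r =", round(r, 3), "t =", round(t, 3),
          "mean difference significant:", abs(t) > T_CRIT_DF7)
```

In the first case the correlation is close to 1 while the mean difference is clearly significant; in the second case the mean difference is zero while the correlation is weak. Neither statistic alone describes agreement.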

The Rasch measurement model, which is based on item response theory (IRT), can overcome these limitations by applying item-level analysis to item difficulty and person ability measures. Item difficulty indicates the estimated level of difficulty assigned to each item by the respondents, and person ability represents the estimated level of disability of the individual who responded to the item [18]. Previous studies that reported discrepancies between child self-reports and parent proxy-reports of HRQoL were limited in that they only applied scale-score level analysis (i.e., correlation, ICC, and t test) and did not include item-level analysis (i.e., Rasch analysis) [9, 19]. Therefore, the purpose of this cross-sectional study is to investigate the level of agreement between child self-reports and parent proxy-reports of the child’s HRQoL in boys with DMD, using both scale-score and item-level analyses.

Methods

Participants

A total of 63 boys with DMD and their parents participated in this study. Boys with DMD and their parents completed a self-report and parent proxy-report of HRQoL, respectively. The boys with DMD were 5–16 years old [mean age 10.2 (2.5)]. Three boys were nonambulatory. Participating parents consisted of both mothers (41 %) and fathers (38 %) from geographically diverse regions of the country (21 % of our parent sample did not report their gender). This study was approved by the University of Florida Institutional Review Board. Written informed consent and assent were obtained from the parent(s) and their child.

Instruments

The Pediatric Quality of Life Inventory version 4.0 (PedsQL™ 4.0)

The PedsQL™ 4.0 is a 23-item questionnaire that assesses HRQoL [20]. It consists of a child self-report and a parent proxy-report. The domains of the PedsQL™ 4.0 are physical functioning, emotional functioning, social functioning, and school functioning. Each item is scored on a 5-point scale (never = 100; almost never = 75; sometimes = 50; often = 25; almost always = 0). The total composite scale score is computed using the mean of the four domains. Scale scores range from 0 to 100, with higher scale scores indicating better HRQoL. The physical functioning domain can be used as a single physical health scale, and the other domains combined can be used as a single psychosocial health scale. The reliability and validity of the PedsQL™ 4.0 self-report and parent proxy-report were established using data from 963 children, including unaffected and chronically ill children, and 1,689 parents [21].
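The scoring rule described above can be sketched as follows. This is a minimal illustration, assuming that raw responses are coded 0 (never) to 4 (almost always) and that every item is answered; the function names and the example responses are hypothetical and do not come from the study data.

```python
# Assumption: raw responses are coded 0 = never ... 4 = almost always.
VALID_RESPONSES = (0, 1, 2, 3, 4)

def transform(raw):
    """Convert a raw 0-4 response to the 0-100 metric (0 -> 100, 4 -> 0)."""
    if raw not in VALID_RESPONSES:
        raise ValueError(f"unexpected response code: {raw!r}")
    return 100 - 25 * raw

def scale_score(raw_items):
    """Mean of the transformed item scores for one scale (all items answered)."""
    scores = [transform(r) for r in raw_items]
    return sum(scores) / len(scores)

# Hypothetical responses from one respondent.
physical_items = [0, 1, 2, 2, 3, 4, 1, 0]            # 8 physical functioning items
psychosocial_items = [1, 1, 0, 2, 1,                 # 5 emotional functioning items
                      2, 3, 1, 0, 2,                 # 5 social functioning items
                      0, 1, 1, 2, 0]                 # 5 school functioning items

print("physical health scale:", scale_score(physical_items))
print("psychosocial health scale:", scale_score(psychosocial_items))
```

Higher values indicate better HRQoL on both scales, because frequent problems map to low transformed scores.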

Statistical analysis

Classical test theory approach (scale-score level)

The level of agreement between child self-reports and parent proxy-reports was analyzed using a two-way random model (absolute agreement, average measures) intraclass correlation coefficient (ICC2,2). An ICC of 0.40 and below indicates poor agreement, an ICC of 0.41–0.60 indicates moderate agreement, an ICC of 0.61–0.80 indicates good agreement, and an ICC of 0.81–1.00 indicates excellent agreement [22]. A paired t test and the Pearson correlation coefficient were used to examine the mean agreement and the consistency of the ratings between children and parents. IBM’s SPSS version 21 was used for data analysis. The α level was set at 0.05.
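For readers who want to see how a two-way random, absolute-agreement ICC is obtained from the mean squares of a two-way ANOVA, the following is a minimal sketch based on the standard Shrout and Fleiss formulas. It is not the SPSS procedure used in the study; the function names and the example dyad scores are hypothetical.

```python
def icc_two_way_random(ratings):
    """Return (single-measure ICC, average-measure ICC) for absolute agreement.

    ratings: one row per subject, one column per rater (e.g. [child, parent]).
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_error / ((n - 1) * (k - 1))

    single = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
    average = (msr - mse) / (msr + (msc - mse) / n)
    return single, average

def agreement_label(icc):
    """Interpretation bands as stated in the text [22]."""
    if icc <= 0.40:
        return "poor"
    if icc <= 0.60:
        return "moderate"
    if icc <= 0.80:
        return "good"
    return "excellent"

# Hypothetical [child, parent] scale scores for six dyads (not study data).
dyads = [[80.0, 70.0], [65.0, 60.0], [90.0, 75.0],
         [50.0, 55.0], [72.0, 60.0], [85.0, 80.0]]
single, average = icc_two_way_random(dyads)
print("single measure:", round(single, 2), agreement_label(single))
print("average measures:", round(average, 2), agreement_label(average))
```

With two raters per child, the average-measure value corresponds to ICC2,2 and the single-measure value to ICC2,1.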

Rasch analysis (item-level)

To confirm that the data fit the Rasch rating scale model, we used fit statistics to examine item and person fit. Model misfit was determined with infit >1.4 and outfit >2.0 mean square (MnSq) values and standardized scores >2.0 [23]. The rating scale categories were evaluated according to Linacre’s [24] suggested essential rating scale characteristics for measure stability and measure accuracy: (1) each category needs at least ten observations, (2) the average measure of each category increases monotonically, and (3) each category’s outfit MnSq value is no greater than 2.0. Following the fit statistics test, data were analyzed for item difficulty and person ability measures. Item difficulty represents the estimated level of difficulty assigned to each item by the respondents, and person ability indicates the estimated level of disability of the individual who responded to the item [18]. Item difficulty and person ability are expressed in log-odds units, or logits. Scatter plots of item difficulty and person ability were used to show the level of agreement between child self-report and parent proxy-report for the physical health and psychosocial health scales, respectively. The scatter plots set child self-report as the X-axis and parent proxy-report as the Y-axis, with a 95 % confidence interval. Winsteps version 3.74 was used for Rasch analysis.
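The quantities named above can be made concrete with a short sketch of the Andrich rating scale model and of the infit and outfit mean squares for a single item. This is an illustration under stated assumptions, not the Winsteps estimation routine: person measures, item difficulty, and thresholds are taken as already known, the standardized (ZSTD) form of the fit statistics is omitted, and all names and numbers are hypothetical.

```python
import math

def category_probabilities(theta, delta, thresholds):
    """Rating scale model: probability of each category 0..m.

    theta: person measure, delta: item difficulty, thresholds: tau_1..tau_m (logits).
    P(X = k) is proportional to exp(sum over j <= k of (theta - delta - tau_j)).
    """
    cumulative = [0.0]
    running = 0.0
    for tau in thresholds:
        running += theta - delta - tau
        cumulative.append(running)
    peak = max(cumulative)                      # subtract the peak for stability
    weights = [math.exp(v - peak) for v in cumulative]
    total = sum(weights)
    return [w / total for w in weights]

def expected_and_variance(probs):
    """Model-expected score and its variance for one person-item encounter."""
    expected = sum(k * p for k, p in enumerate(probs))
    variance = sum((k - expected) ** 2 * p for k, p in enumerate(probs))
    return expected, variance

def item_fit(observed, thetas, delta, thresholds):
    """Infit and outfit mean squares for one item across persons."""
    sq_residuals, variances = [], []
    for x, theta in zip(observed, thetas):
        e, v = expected_and_variance(category_probabilities(theta, delta, thresholds))
        sq_residuals.append((x - e) ** 2)
        variances.append(v)
    outfit = sum(r / v for r, v in zip(sq_residuals, variances)) / len(observed)
    infit = sum(sq_residuals) / sum(variances)  # information-weighted mean square
    return infit, outfit

# Hypothetical 5-category item (categories 0-4) and five respondents.
taus = [-1.5, -0.5, 0.5, 1.5]
person_measures = [-1.0, -0.3, 0.0, 0.6, 1.2]
responses = [1, 2, 2, 3, 0]                     # the last response is unexpected
infit, outfit = item_fit(responses, person_measures, 0.0, taus)
print("infit MnSq:", round(infit, 2), "outfit MnSq:", round(outfit, 2))
```

Mean squares near 1 indicate responses about as variable as the model expects; an unexpected response, such as the last one here, pushes the values upward.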

Results

HRQoL mean scores and intraclass correlation coefficient (ICC)

Based on the ICC, agreement between children and parents was good on the physical health scale and moderate on both the psychosocial health scale and the total composite score (Table 1). Means of parent proxy-reports were significantly lower than means of child self-reports on the physical health scale, psychosocial health scale, and total composite score (p < 0.05). In other words, parents consistently underestimated their child’s HRQoL. Also, both the mean difference and the Pearson r were significant for all three scores. This result indicated that the ratings of children and parents were consistent in terms of rank order, but that the two raters did not give the same scores.

Table 1 HRQoL mean scores and the level of agreement between children and parents

Rating scale categories

Both the physical health and psychosocial health ratings of children and parents met Linacre’s three essential criteria for measure stability and measure accuracy. Each rating category had more than ten observations, the observed average measures of both scales advanced monotonically, and the outfit mean squares were <2.0 for both scales and for both raters. The rating scale analyses for child self-reports of the physical health scale and for parent proxy-reports of the psychosocial health scale are presented as examples in Tables 2 and 3. The category characteristic curves presented two patterns. Categories 1 and 3 did not emerge as more probable than categories 0, 2, and 4 (Fig. 1), except in parent proxy-reports of the psychosocial health scale, where all categories presented reasonable probability (Fig. 2).

Table 2 Rating scale categories in child self-reports of the physical health scale
Table 3 Rating scale categories in parent proxy-reports of the psychosocial health scale
Fig. 1 Category characteristic curves in child self-reports of the physical health scale

Fig. 2 Category characteristic curves in parent proxy-reports of the psychosocial health scale

Fit statistics

Item fit

All items fit the Rasch model, both in the parent proxy-reports of the physical health scale and in the child self-reports of the psychosocial health scale. Conversely, 2 out of 8 items showed high infit statistics in the child self-reports of the physical health scale (taking a bath or shower; doing chores around the house). In addition, 2 out of 15 items showed high infit for the parent proxy-reports of the psychosocial health scale (trouble sleeping; keep up with school work).

Person fit

Four out of 63 respondents (6 %) misfit the Rasch model in both the child self-reports and the parent proxy-reports of the physical health scale. Eight out of 63 children (12 %) displayed misfit in the child self-reports of the psychosocial health scale; 5 out of 63 parents (8 %) misfit the Rasch model in the parent proxy-reports of the psychosocial health scale.

Item difficulty of Rasch analysis in physical health scale

The level of agreement between child self-reports and parent proxy-reports of the physical health scale is presented in Fig. 3. The dashed line represents the 95 % confidence interval (CI), and the identity line represents perfect agreement between children and their parents on the difficulty of the items. One out of eight items was located outside of the 95 % CI (low energy level). In other words, difficulty ratings between children and their parents were significantly different for this item. Specifically, parents rated the item “low energy level” lower than their children did (i.e., parents perceive their child’s energy level to be lower than the child himself perceives it to be).

Fig. 3 Scatter plot of item difficulty in the physical health scale
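One common way to decide whether a point in such a scatter plot lies outside the 95 % confidence band is to compare the difference between the two measures with their joint standard error. The sketch below assumes that convention; the text does not state exactly how the band was constructed, so this is an illustration rather than the study's procedure. The function name, item labels, measures, and standard errors are all hypothetical.

```python
import math

def outside_95_ci(child_measure, parent_measure, child_se, parent_se, z=1.96):
    """True if two logit measures differ by more than z joint standard errors."""
    joint_se = math.sqrt(child_se ** 2 + parent_se ** 2)
    return abs(child_measure - parent_measure) > z * joint_se

# Hypothetical item difficulties (logits) with standard errors, not study values:
# (label, child measure, parent measure, child SE, parent SE)
items = [
    ("item A", 0.10, 0.25, 0.15, 0.15),
    ("item B", -0.40, 0.60, 0.20, 0.20),
]

for label, child, parent, se_child, se_parent in items:
    flagged = outside_95_ci(child, parent, se_child, se_parent)
    print(label, "outside 95 % CI" if flagged else "within 95 % CI")
```

The same check applies to person ability measures, where larger standard errors at the extremes of the scale widen the band.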

Person ability of Rasch analysis in the physical health scale

The person ability as self-rated by the child and as proxy-rated by the parent is plotted in Fig. 4. Values on the identity line indicate that the child and his parent reported the same value on the physical health scale. Values above the identity line indicate that the parent rated the child higher than the child rated himself. In contrast, values below the identity line indicate that the parent rated the child lower than the child rated himself. Only 1 of 63 parents (1.5 %) rated their child’s physical health higher than their child did, while 14 of 63 parents (22 %) rated their child’s physical health lower than their child did. The 95 % CI expanded for the higher- and lower-ability children, indicating that there was increased error at the extremes of the scale.

Fig. 4 Scatter plot of person ability in the physical health scale

Item difficulty of Rasch analysis in the psychosocial health scale

The level of agreement between child self-reports and parent proxy-reports of the psychosocial health scale is presented in Fig. 5. Three out of 15 items were located outside of the 95 % CI line. Parents rated two items higher than their child did (not able to do things that other children his or her age can do; keeping up when playing with other children). Conversely, the parents rated the item “missing school because of not feeling well” lower than their children did (i.e., parents believe their child misses school more frequently because of sickness than their child reports).

Fig. 5 Scatter plot of item difficulty in the psychosocial health scale

Person ability of Rasch analysis in the psychosocial health scale

The person ability as self-rated by the child and as proxy-rated by the parent is plotted in Fig. 6. Five out of 63 parents (8 %) rated their child’s psychosocial health higher than their child did, while 10 out of 63 parents (16 %) rated their child’s psychosocial health lower than their child did. The 95 % CI expanded for the higher- and lower-ability children, indicating that there was increased error at the extremes of the scale.

Fig. 6 Scatter plot of person ability in the psychosocial health scale

Discussion

This study explored the differences in perceptions on the PedsQL™ 4.0 between child self-reports and parent proxy-reports in boys with DMD using both classical test theory (CTT, scale-score level) and Rasch analysis (item-level). The CTT approach provided an overall measure of agreement for each scale of the PedsQL™ 4.0 (e.g., physical health scale), and Rasch analysis provided item-level evidence of the relationship between child and parent ratings. Analyses through CTT determined that child and parent ratings for the physical health scale showed good agreement, whereas the psychosocial health scale showed moderate agreement. In Rasch analysis, the item difficulty scatter plot for the psychosocial health scale showed slightly larger disagreement than that for the physical health scale. Also, the person ability scatter plots for both the physical and psychosocial health scales demonstrated that more parents rated their child’s HRQoL significantly lower than their child’s self-rating than rated it higher.

All scales met Linacre’s essential rating scale criteria for measure stability and measure accuracy. However, based on the category characteristic curves, we found three dominant rating categories: 0, 2, and 4. Future studies could consider collapsing the 5-point scale to a 3-point scale, except for parent proxy-reports of the psychosocial health scale; this scale showed reasonable probability for all rating scale categories and does not need to be collapsed. Since the items did not radically misfit the Rasch model, we had no strong rationale to remove or modify the items. Future studies could explore the fit with larger sample sizes. In addition, person misfit was not extraordinarily high, suggesting that the sample fit the Rasch model relatively well.

This is one of the first studies to investigate the level of agreement on HRQoL between child self-reports and parent proxy-reports using both scale-score and item-level analyses. The findings indicate that child-parent agreement on HRQoL is higher than in a former study of boys with DMD by Bray [6], which showed moderate to poor agreement (CTT approach). Our study has a larger sample size and fewer nonambulatory boys than Bray’s study, which could explain the differing results. Moreover, Bray used a single rater ICC value (ICC2,1), which yields a smaller ICC than the mean ICC of two raters’ ratings (ICC2,2) [26] that was applied in our study. As previous studies have shown, our study indicates that non-observable factors, such as emotional or social functioning, demonstrate lower agreement than observable factors, such as physical functioning [6, 12, 27].
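The size of the gap between a single-rater and an average-measure ICC can be sketched with the Spearman-Brown relation, which links the two coefficients when they are computed from the same two-way model and the same data. The input value below is hypothetical and is not taken from Bray's study or from ours; the function name is likewise illustrative.

```python
def average_measures_icc(single_icc, k):
    """Average-measure ICC implied by a single-measure ICC for k raters."""
    return k * single_icc / (1 + (k - 1) * single_icc)

# Hypothetical single-rater value; k = 2 raters (child and parent).
single = 0.50
print("single rater:", single)
print("mean of two raters:", round(average_measures_icc(single, 2), 2))
```

For any positive single-rater value below 1, the average-measure coefficient is larger, so the two kinds of ICC should not be compared directly against the same cut-offs.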

Findings based on the ICC (CTT approach) were supported by the item difficulty scatter plots of the Rasch analysis, which indicated better agreement in the physical health scale than in the psychosocial health scale; that is, more items in the psychosocial health scale (three items) fell outside the 95 % confidence interval than in the physical health scale (one item). In the physical health scale, the item “low energy level” showed only a 0.48 logit difference between parents and children, with the item being located near the CI line. In contrast, three items in the psychosocial health scale showed larger differences between children and parents, ranging from 0.85 to 1.27 logits. Two of these items were within the social functioning domain (doing things that other children his or her age can do; keeping up when playing with other children). Usually, boys with DMD require significant effort in performing many physical activities and may become tired more rapidly than their peers due to muscle weakness [28, 29]. Even with such physical difficulties, the children in our sample perceived their ability to keep up with their peers as less difficult than their parents did. This discordance between parents and children may exist because their ratings are based on different reasoning processes, different response styles, and different interpretations of items [30]. Also, a significant discrepancy was observed between parents and children for “missed school because of not feeling well,” but the majority of both children and parents responded “never” or “almost never,” which indicates general agreement.

In the person ability scatter plots, more parents rated their child’s HRQoL significantly lower than their child rated his own HRQoL for the physical health scale (22 %) and the psychosocial health scale (16 %), which is consistent with previous studies [15, 16]. The expansion of the CI was likely due to having fewer child–parent dyads at the extremes of the scale. Our results may indicate that parents underestimate their child’s HRQoL because they anticipate a more negative effect from the disability than their child actually experiences [12, 29]. Even though the findings suggest that parents may not have enough knowledge about their child’s non-observable functioning (i.e., emotion and peer relationships), parents are regarded as crucial informants about their child, and they provide complementary information to the child self-report [12]. Furthermore, a parent proxy-report would be a useful source of information when a child is too young or sick to complete a self-report of HRQoL [10]. However, when a child is able to report his/her own HRQoL, a child self-report is preferable to a parent proxy-report for measuring the child’s HRQoL, since HRQoL is based on the individual’s perception of daily life [7].

The Rasch measurement model may offer advantages for investigating individual differences that may be overlooked by the CTT approach. Although our sample of 63 is relatively small for the application of Rasch analysis, it is regarded as well targeted. This sample size is enough to provide stable item calibrations within ±0.5 logits with 95 % confidence [31]. By investigating the level of agreement between child self-reports and parent proxy-reports at the item-level, our study seeks to broaden knowledge of the discrepancy between the ratings of parents and children. Moreover, the findings highlight the importance of sharing information between child and parent and may provide further information for health professionals when planning therapy goals and interventions. Future studies should consider conducting in-depth follow-up interviews with children, as well as parents, regarding the items that showed discrepancies between the two HRQoL reports. Additional factors, such as the child’s age, disease severity, and parental health status, also need to be investigated in order to determine how child and parent factors could affect the level of agreement between child self-reports and parent proxy-reports.