Introduction

Health-related quality of life (HRQOL) has emerged as an important outcome in clinical trials and population health surveys [1]. In China, HRQOL research has made remarkable progress in adult populations, but population-level studies of children’s HRQOL are limited [2, 3]. One of the main reasons is a lack of suitable Chinese-language instruments that have been developed or adapted according to established scientific criteria and attributes [4].

The Pediatric Quality of Life Inventory Measurement Models (PedsQL™) was first developed by Varni et al. in 1999. It includes a generic core scale and several disease-specific modules. Each scale has a set of seven forms: self-reports for children aged 5–7, 8–12 and 13–18 years and proxy reports for children aged 2–4, 5–7, 8–12 and 13–18 years. The items on these forms are essentially identical, differing only in developmentally appropriate language and in first- or third-person wording [5–7]. To date, the generic core scale has been revised to its fourth version. Many studies have shown that the PedsQL™ is a reliable, valid and sensitive instrument [8–10]. The PedsQL™ has been translated into many languages and used in 53 countries and areas [11–14], but no Chinese version has been developed. To provide a cross-culturally valid and reliable instrument for assessing the HRQOL of children in China, we translated the PedsQL™ into Mandarin Chinese with permission from the developer, following international guidelines for the linguistic validation of instruments [15, 16].

This study aimed to evaluate the psychometric properties of the Chinese version of the PedsQL™ 4.0 Generic Core Scales and to determine whether it is suitable for assessing the HRQOL of Chinese children.

Methods

Subjects

Healthy children and pediatric patients aged 5–18 years were recruited in Guangzhou and Shanghai, China. We chose a kindergarten and a primary school in Guangzhou and three primary schools in Shanghai by convenience sampling and then randomly selected grades in each school. All students in the selected grades were included as healthy children, except those who self-reported an acute or chronic disease that was later verified by a physician. Pediatric patients were selected by convenience sampling from two triple-A hospitals in Guangzhou and Shanghai, respectively. The exclusion criteria were parents being illiterate or unwilling to participate, and the child being reported to have an intellectual disability. The children/adolescents were divided into five subgroups: healthy children/adolescents, children/adolescents with leukemia, pediatric patients with migraine, children/adolescents with epilepsy, and pediatric patients with Gilles de la Tourette’s syndrome.

Caregivers of the healthy children/adolescents and of the pediatric patients with leukemia recruited in Guangzhou formed the proxy sample.

This study was approved by the Ethics Committee of the School of Public Health, Sun Yat-sen University. All subjects signed informed consent forms.

Data collection

Four interns and two physicians were trained as interviewers before the investigation formally began. Of the 2,918 children/adolescents recruited, those aged 5–7 years were interviewed by the interviewers, and those aged 8–18 years completed the questionnaires themselves. Health professionals were available to answer questions. The proxies completed the questionnaire themselves.

Measures

In this study, the Chinese translations of the PedsQL™ 4.0 Self-Report and Proxy Report for ages 5–18 years were used. The scale has 23 items grouped into four subscales: Physical Functioning (8 items), Emotional Functioning (5 items), Social Functioning (5 items) and School Functioning (5 items); the Psychosocial scale comprises the Emotional, Social and School Functioning subscales. The questionnaire asks about the frequency of problems during the past month. Responses are rated on a five-point scale. Items are reverse-scored and linearly transformed to a 0–100 scale, so that higher scores indicate better HRQOL. Subscale scores were computed as the sum of the items divided by the number of items answered.
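
The scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the developer's scoring program: it assumes that raw responses are coded 0–4, with 0 meaning the problem never occurs and 4 meaning it almost always occurs, and the function names are hypothetical.

```python
from typing import Optional, Sequence

# Assumption: raw responses are coded 0..4 (0 = never a problem,
# 4 = almost always a problem). The text only states "five-point scale".
VALID_RESPONSES = (0, 1, 2, 3, 4)


def transform_item(raw: int) -> float:
    """Reverse-score one item and map it linearly onto 0-100."""
    if raw not in VALID_RESPONSES:
        raise ValueError(f"response must be one of {VALID_RESPONSES}, got {raw!r}")
    return (4 - raw) * 25.0


def subscale_score(raw_items: Sequence[Optional[int]]) -> Optional[float]:
    """Sum of transformed items divided by the number of items answered.

    Missing responses are passed as None and are left out of both the
    sum and the divisor. Returns None if no item was answered.
    """
    answered = [transform_item(r) for r in raw_items if r is not None]
    if not answered:
        return None
    return sum(answered) / len(answered)
```

Because the divisor is the number of items answered, a questionnaire with a missing item still yields a subscale score on the same 0–100 metric.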

Data analysis

Data were analyzed with SPSS 17.0 for Windows and LISREL 8.70. Feasibility was determined from the response rate and the percentage of questionnaires with missing items. Scores are presented as \( \bar{X} \pm SD \).

Internal reliability was assessed with Cronbach’s α coefficient. Pearson correlation coefficients were used to evaluate scaling success. To assess construct validity, the Mann–Whitney U test was used to detect differences among the five groups of children, with the significance level adjusted to 0.005. Confirmatory factor analysis was also performed to evaluate the construct validity of the scaling structure [17, 18]. Intraclass correlation coefficients (ICC) and paired-sample t tests were used to assess the concordance between self-reports and proxy reports.
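
For readers who want to reproduce the reliability analysis outside SPSS, Cronbach’s α can be computed from the item variances and the variance of the summed score. The sketch below uses only the Python standard library. It assumes complete responses (no missing items), and the function name and data layout are illustrative rather than taken from the study.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items score table.

    rows[i][j] is the score of respondent i on item j.
    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    Sample variances (n - 1 denominator) are used throughout.
    """
    if len(rows) < 2:
        raise ValueError("need at least two respondents")
    k = len(rows[0])
    if k < 2:
        raise ValueError("need at least two items")
    if any(len(row) != k for row in rows):
        raise ValueError("every respondent must have a score for every item")

    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")

    return k / (k - 1) * (1 - sum(item_variances) / total_variance)
```

Values of α at or above 0.7 are the threshold applied to each subscale in this study.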

Results

Subjects

In total, 2,918 children/adolescents aged 5–18 years participated in this study, including 1,583 healthy and 1,335 affected children/adolescents. Among them, 51.92% of the subjects were boys and 48.28% were girls. A total of 325 proxies (84.31% mothers, 13.54% fathers, 1.54% others, 0.61% missing) completed the PedsQL™ 4.0. Table 1 presents the health status, age and gender distribution of the sample.

Table 1 Distribution of sample according to health status, age and gender

Feasibility

The response rate was 95.0%. For self-reports, 1.47% of questionnaires had some items missing, and the mean number of missing items was 1.33. For proxy reports, 7.69% had some items missing, and the mean number of missing items was 1.24.

Descriptive analysis

Tables 2 and 3 present the means and standard deviations of subscale scores for each subgroup.

Table 2 Scale descriptive, reliability and validity for PedsQL™ 4.0 generic core scales for self-reports (\( \bar{X} \pm SD \))
Table 3 Scale descriptive, reliability and validity for PedsQL™ 4.0 generic core scales for proxy reports (\( \bar{X} \pm SD \))

Reliability

For self-reports in healthy children and in the total pediatric patient sample, Cronbach’s α was above 0.7 for the Psychosocial, Physical Functioning and Social Functioning scales but not for the Emotional Functioning and School Functioning subscales. For proxy reports, α was higher than 0.7 for all subscales.

Item-scale correlations

To evaluate the item-scale correlations, Pearson correlations between subscale and item scores were analyzed for self-reports and proxy reports. Each item had a moderate to strong correlation with its own subscale, and these correlations were significantly higher than those with the other subscales (P < 0.01).

Construct validity

All scores of healthy children/adolescents were significantly higher than those of the four groups of pediatric patients (P < 0.001). This indicated that all subscales were able to discriminate between healthy children and pediatric patients.
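
The adjusted significance level of 0.005 is consistent with a Bonferroni correction over all pairwise comparisons among the five groups. The text does not name the correction or the nominal level, so the arithmetic below is an inference that assumes a nominal level of 0.05:

\[ m = \binom{5}{2} = 10, \qquad \alpha' = \frac{0.05}{m} = \frac{0.05}{10} = 0.005 \]

Under this assumption, the reported P < 0.001 for each comparison of healthy children with a patient group falls below the adjusted threshold.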

Construct validity was also tested by confirmatory factor analysis, using a four-factor model based on the original scaling structure. The goodness-of-fit results for the four-factor and one-factor models are shown in Table 4. The goodness-of-fit indices showed that the four-factor model fit better than the one-factor model.

Table 4 Indices of goodness of fit of one-factor and four-factor models

Self-report/Proxy report concordance

Intraclass correlations (ICC) between self-report and proxy report scores showed that proxy report scores were highly correlated with self-report scores for all subscales (ICCs ranged from 0.64 to 0.78).
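
The text does not state which form of ICC was used. As an illustration only, the sketch below computes two common single-measure forms from paired self-report and proxy scores using a two-way ANOVA decomposition: a consistency ICC, which ignores a systematic offset between the two raters, and an absolute-agreement ICC, which penalizes it. The function name and the choice of forms are assumptions.

```python
from typing import Sequence, Tuple


def icc_pairs(pairs: Sequence[Tuple[float, float]]) -> Tuple[float, float]:
    """Single-measure ICCs for (self-report, proxy) score pairs.

    Returns (consistency, absolute_agreement), both derived from the
    mean squares of a two-way layout with n subjects and k = 2 raters.
    """
    n, k = len(pairs), 2
    if n < 2:
        raise ValueError("need at least two child/proxy pairs")

    grand = sum(c + p for c, p in pairs) / (n * k)
    subject_means = [(c + p) / k for c, p in pairs]
    rater_means = [sum(c for c, _ in pairs) / n, sum(p for _, p in pairs) / n]

    ss_subjects = k * sum((m - grand) ** 2 for m in subject_means)
    ss_raters = n * sum((m - grand) ** 2 for m in rater_means)
    ss_total = sum((x - grand) ** 2 for pair in pairs for x in pair)
    ss_error = ss_total - ss_subjects - ss_raters

    ms_subjects = ss_subjects / (n - 1)
    ms_raters = ss_raters / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    denom_consistency = ms_subjects + (k - 1) * ms_error
    denom_agreement = denom_consistency + k * (ms_raters - ms_error) / n
    if denom_consistency == 0 or denom_agreement == 0:
        raise ValueError("scores have no variance; ICC is undefined")

    consistency = (ms_subjects - ms_error) / denom_consistency
    agreement = (ms_subjects - ms_error) / denom_agreement
    return consistency, agreement
```

A high consistency ICC can coexist with a systematic difference in means between raters, which is why a paired-sample t test is a useful complement when judging concordance.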

The mean scores of all subscales of children’s self-reports were significantly higher than the scores reported by their proxies, except for the Social Functioning subscale.

Discussion

The results showed that internal reliability exceeded 0.70 for all but the Emotional and School Functioning subscales for self-reports in healthy children and in pediatric patients with leukemia or epilepsy. All α coefficients were higher than 0.7 for the proxy-report subscales, indicating good reliability of the proxy-report instrument. These findings are consistent with results for the original instrument, for which α ranged from 0.66 to 0.89 [6, 7]. In patients with migraine or Gilles de la Tourette’s syndrome, however, internal reliability was poor for self-reports. This may be disease specific; further study with other samples is needed. Regarding validity, the results indicated that the discriminative ability of the scale was sufficient to distinguish healthy children from pediatric patients. Similarly, Varni et al. reported that the PedsQL™ 4.0 distinguished between healthy children and pediatric patients with acute or chronic health conditions [6]. Although the CFI indicated good structural validity, the AGFI and RMSEA did not reach the standard for acceptable construct validity.

There was a moderate to high level of correlation between self-reports and proxy reports. The correlation coefficients found in our study are higher than those reported for the original instrument, which ranged from 0.36 to 0.50 [6]. We found that the average self-report scores were significantly higher than the proxy-report scores, except for the Social Functioning subscale. This finding is consistent with previous research indicating that parents and children disagree more on internalizing problems such as anxiety and sadness [19–21].

This study had some limitations. First, test–retest reliability was not evaluated. Second, children aged 5–7 years completed the scale by interviewer administration. Third, the study was conducted only in large cities in China.

Conclusions

This is the first study to evaluate the psychometric properties of the Chinese version of the PedsQL™ 4.0 Generic Core Scales in a relatively large sample. The data provide reasonable evidence that the Chinese PedsQL™ 4.0 has acceptable psychometric properties, with the exception of construct validity as tested by confirmatory factor analysis and internal reliability for self-reports in pediatric patients with migraine or Gilles de la Tourette’s syndrome. Future studies should further test construct validity and self-report internal reliability in other samples, evaluate sensitivity and responsiveness in longitudinal studies, and assess the HRQOL of children in rural areas.