Introduction

Disease and treatment have physical, psychological, and social effects [1, 2], and survivors of chronic childhood disease often experience psychosocial maladaptation in education, employment, or marriage [3–6]. As recent advances in diagnosis and treatment methodology have improved the survival rate for several severe childhood diseases, the long-term support of survivors is an increasing challenge for pediatric and young adult health-care professionals [7, 8]. Indeed, in Japan alone, an estimated 50,000 pediatric cancer patients (approximately 1 in 700 young adults) [9] and 400,000 children with congenital heart disease have survived to adulthood [10].

Under such circumstances, follow-up clinics after treatment are needed to check and screen survivors for health and development into adulthood [11], thereby enabling medical and social workers to provide appropriate drug administration, health and psychological education, and social support to survivors and their families. The outcomes of such long-term follow-up can be evaluated based on objective indicators (prevalence of late effects, complications, survival rate, education continuance rate, employment rate) and patient-reported outcomes.

Health-related quality of life (HRQOL) is a patient-reported outcome recognized by clinicians, researchers, and health-care providers as an important multidimensional (physical, psychological, and social) outcome for survivors of chronic childhood disease [9, 12–19]. HRQOL can be measured by several scales. For example, the Medical Outcome Study 36-Item Short Form Health Survey (SF-36) [20–23] measures HRQOL for adults. Other scales, such as the German Quality of Life Questionnaire (KINDL) [24], are used for children but become less useful for longitudinal assessment as survivors mature. However, few scales measure HRQOL with consistency from childhood to adulthood [25].

The Pediatric Quality of Life Inventory 4.0 (PedsQL) Generic Core Scales are a widely used measurement of HRQOL for children aged 2–18 years [26]. This scale has been validated for children with and without several diseases or disabilities and translated into a number of languages, including Japanese [27]. The PedsQL scales are designed in several formats for different age groups, as follows: children aged 2–4 years (toddler), 5–7 years (young child), 8–12 years (child), 13–18 years (adolescent), and 18–25 years (young adult). Further, a version of the PedsQL Generic Core Scales for young adults 18–25 years was recently developed and validated for application in university students [28] and cancer survivors older than 25 years [29]. A modified form was also used for adolescents and young adults (ages 16–25 years) with cancer or blood disorders [30]. Although each format (for toddler, young child, child, adolescent, and young adult) varies to accommodate differences in lifestyle and cognitive development, the measured content and underlying concepts are consistent for all ages. This relatively wide age range and consistency across ranges allow medical practitioners to design longitudinal investigations of HRQOL. Data obtained using such scales can also be compared across international borders.

Here, we report the development of a Japanese version of the PedsQL Generic Core Scales young adult format (PedsQL-YA-J). We investigated the feasibility, reliability, and validity of the scales among young adults in education, employment, or training. Previous tests of the original English version reported the internal consistency, known-groups validity, and convergent and discriminant validity of the scales [28]. We therefore tested these measures and also the retest reliability, concurrent validity, and factorial validity.

Methods

Scale development

Dr. James W. Varni (JWV), the PedsQL developer, permitted translation of the original PedsQL Generic Core Scales Young Adult Version (PedsQL-YA-O) into Japanese using an approved translation procedure [31]. The draft PedsQL-YA-J was developed from the PedsQL-YA-O and the Japanese version of PedsQL Generic Core Scales for younger age groups (PedsQL-J) [27] with the intention of keeping the wording and content consistent with the PedsQL-J and the original (PedsQL Generic Core Scales for younger ages: PedsQL-O) [26] while being sensitive to age-appropriate differences. The consistency of this translation facilitates the evaluation of differences in HRQOL across and between age groups, as well as tracking HRQOL over time.

Identical items on the PedsQL-O and PedsQL-YA-O scales were identically translated in the PedsQL-YA-J and PedsQL-J scales. The only exceptions arose from Japanese language variations or cultural constraints; for example, ‘the chores around the house’ was translated into ‘ie no naka no koto’ (‘doing things around the house’) for young adults tested on the PedsQL-YA-J and ‘ie no otetsudai’ (‘assisting their parents around the house’) for children on the PedsQL-J to ensure that both phrases described age-appropriate chores. The authors discussed this translation and agreed on a single, reconciled version that was conceptually equivalent to the original version and written in easily understood and age-appropriate language.

Following this Japanese translation, a native English translator proficient in Japanese and blinded to the original version then translated the reconciled version back into English. We produced a pilot questionnaire after comparing the back-translated and original versions and making minor amendments to the reconciled version.

Twelve native Japanese-speaking young adults pilot-tested the questionnaire between August and September 2012. The participants differed in age, gender, and education or employment status. A researcher (MK) measured the time taken to complete the questionnaire and, after completion, interviewed participants using cognitive interviewing methodology [31] to deduce the thought processes used in answering the questionnaire. The data obtained from the pilot test were used to produce a final version of the PedsQL-YA-J. No words or phrases required modification after the pilot test, and we confirmed that the pilot participants understood, interpreted, and answered the items without difficulty. JWV reviewed the conceptual and linguistic equivalence between the final PedsQL-YA-J and the PedsQL-YA-O.

Study population

We recruited young adults aged 18–25 years (age range covered by the PedsQL-YA-J) from three large companies (current or prospective employees), two smaller companies (current employees), one university (students), and two vocational schools (students) between October and December 2012.

Procedure

We recruited participants from the five companies via collaborators who were members of the alumni association of the School of Health Science, The University of Tokyo. Researchers presented the study details to the collaborators, who then distributed the questionnaires and return envelopes to the participants. Participants provided informed consent via the questionnaires and directly returned the questionnaires by mail to the researchers.

We recruited students from the University of Tokyo in collaboration with professors of various faculties, excluding the faculty of medicine (author’s affiliation). We recruited vocational school students by direct contact with schoolteachers and principals. A researcher or collaborator presented the study to university or vocational school students orally and in writing following lectures and distributed the questionnaires and return envelopes. Participants provided informed consent by completing the questionnaire and returning it immediately in person or later by mail to the researchers. Information regarding non-participants was not collected.

We tested the retest reliability by providing details of the retest procedure in writing to all participants in the first test and asking for volunteers. One to two weeks after the initial questionnaire, we sent retest questionnaires and return envelopes to the participants who had provided their address and asked them to return the retest within a week of receipt.

Ethical considerations

The review board of the Graduate School of Medicine and Faculty of Medicine in the University of Tokyo approved the pilot test and main survey (No. 3841 and 3931). All participants were volunteers and returned the completed questionnaires directly to the researchers, without passing them through the company or school administration.

Measurements

The PedsQL-YA-J has four subscales—Physical Functioning (eight items), Emotional Functioning (five items), Social Functioning (five items), and Work/School Functioning (five items)—and is similar to the PedsQL-O, PedsQL-J, and PedsQL-YA-O. Respondents were asked to describe the extent to which each item had troubled them over the past 1 month. A 5-point Likert response scale was used (0 = never [a problem]; 1 = almost never; 2 = sometimes; 3 = often; 4 = almost always). Items were reverse-scored and linearly transformed to a 0–100 scale, where higher scores indicate a better HRQOL. To account for missing data, scale scores were computed as the sum of the items divided by the number of items answered. The total from the 23 items was computed in a similar manner. If more than 50 % of the items were missing or incomplete, the scale score was not computed. Previous reports show the original version has acceptable construct validity and internal consistency (Cronbach’s coefficient alpha [32] = 0.76–0.86).
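The scoring rule described above can be illustrated with a minimal Python sketch. It assumes that responses are coded 0–4 and that missing items are represented as None; the function name and data layout are hypothetical and are not taken from the PedsQL scoring materials.

```python
from typing import Optional, Sequence

# Reverse-scoring and linear transformation of the 0-4 response to 0-100:
# 0 -> 100, 1 -> 75, 2 -> 50, 3 -> 25, 4 -> 0.
REVERSE_MAP = {0: 100.0, 1: 75.0, 2: 50.0, 3: 25.0, 4: 0.0}


def scale_score(responses: Sequence[Optional[int]]) -> Optional[float]:
    """Return a 0-100 scale score, or None when the score is not computed.

    `responses` holds one entry per item: an integer 0-4, or None when the
    item is missing (an assumed representation, for illustration only).
    """
    if not responses:
        return None
    answered = [REVERSE_MAP[r] for r in responses if r is not None]
    n_missing = len(responses) - len(answered)
    # Rule from the text: no score if more than 50 % of items are missing.
    if n_missing > len(responses) / 2:
        return None
    # Sum of the transformed items divided by the number of items answered.
    return sum(answered) / len(answered)
```

Under these assumptions, the total score would be obtained by passing all 23 item responses to the same function.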

The performance of the PedsQL-YA-J was compared with the SF-36 and CES-D scales. Both scales have been validated in Japan and are commonly used in the general population [21, 33]. The SF-36 (version 2) is a 36-item instrument that uses three to six category Likert response scales [20] and produces eight subscales (Physical Functioning, Role Physical, Bodily Pain, General Health, Vitality, Social Functioning, Role Emotional, and Mental Health) and two summary scores (Physical Component Summary and Mental Component Summary). Each scale score and summary score are weighted by norm-based scoring methods [21], where higher scores indicate a better HRQOL. We used the SF-36 scale to test validity as the SF-8 (shortened version of SF-36) scale had been used to verify the validity of the PedsQL-O [28].

The Center for Epidemiologic Studies Depression Scale (CES-D) is a 20-item instrument for assessing symptoms of depression [34]. We used the CES-D to test validity, as this scale has been used to verify the validity of the PedsQL-J [27]. In the CES-D, participants indicate how often they experienced symptoms during the previous week on a four-point scale. A total score is calculated where higher scores represent elevated levels of depressive symptoms. In healthy populations, scores <16 are considered normal, while scores ≥16 indicate depression.

Participants were also asked to record their age, gender, first language, educational/employment status (university student, vocational school student, or company employee), working status, living arrangement (living with anyone or alone), subjective symptoms, illness or injury requiring regular medical visits, and subjective opinion of economic status and life. We also added one question to the retest questionnaire: ‘Has a significant event affecting you happened since responding to the initial questionnaire?’

Statistical analyses

All analyses were performed using IBM SPSS software, version 19 (SPSS, Inc., Chicago, IL, USA), and the level of significance was set at 0.05. Score distributions for the PedsQL-YA-J were summarized as mean, standard deviation, minimum and maximum scores, and percentages of floor (0) and ceiling (100) scores, by educational/employment status. We compared mean scores for educational/employment status using Welch’s analysis of variance (ANOVA).
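As an illustration only, the percentages of floor (0) and ceiling (100) scores could be computed as in the following Python sketch. The analyses reported here were performed in SPSS; the function name is hypothetical.

```python
from typing import Sequence, Tuple


def floor_ceiling_pct(scores: Sequence[float]) -> Tuple[float, float]:
    """Return (% of scores at the floor of 0, % of scores at the ceiling of 100)."""
    n = len(scores)
    floor_pct = sum(1 for s in scores if s == 0) / n * 100
    ceiling_pct = sum(1 for s in scores if s == 100) / n * 100
    return floor_pct, ceiling_pct
```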

Feasibility was determined based on the time taken to complete the pilot questionnaire and the percentage of missing values. Independence of easily missed items was assessed by Cochran’s Q test. Reliability was assessed based on internal consistency and retest reliability. Good internal consistency was defined as a Cronbach’s coefficient alpha value exceeding 0.70. To determine retest reliability, intraclass correlation coefficients (ICC) between the initial test and retest scores in a one-way random effects model were calculated; an ICC value of 0.40 represented moderate, 0.60 good, and 0.80 high agreement [35]. A paired t test between the initial test and retest scores was used to check whether or not the PedsQL scores had changed.
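The two reliability statistics named above can be sketched in Python as follows. This is an illustration, not the SPSS procedure used in the study. It assumes complete data, uses the usual formula for Cronbach’s coefficient alpha, and takes the single-measurement form of the one-way random effects ICC, often written ICC(1,1); the text does not state whether single or average measures were used, so that choice is an assumption.

```python
from statistics import mean, variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha; `rows` holds one complete list of item scores per respondent."""
    k = len(rows[0])
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def icc_oneway(rows: Sequence[Sequence[float]]) -> float:
    """ICC(1,1): one-way random effects model, single measurement (assumed form).

    `rows` holds one list of k repeated scores per respondent,
    e.g. [initial score, retest score].
    """
    n = len(rows)
    k = len(rows[0])
    grand = mean([x for row in rows for x in row])
    row_means = [mean(row) for row in rows]
    # Between-respondent and within-respondent mean squares.
    ms_between = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(rows, row_means) for x in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
```

Neither function guards against zero variance, which would need handling before use on real data.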

Validity was assessed based on concurrent validity, convergent and discriminant validity, known-groups validity, and factorial validity. Concurrent validity was checked by calculating Pearson’s product–moment correlation coefficients to confirm that the total score and all scale scores were negatively correlated with the CES-D score. Correlation coefficients of 0.10 represent small, 0.30 medium, and 0.50 large correlations [36]. Convergent and discriminant validity were examined by calculating Pearson’s product–moment correlation coefficient between the scale scores of the PedsQL-YA-J and the predicted scores from the SF-36 scale. We hypothesized that the correlation of the Physical Functioning scale of PedsQL-YA-J would be highest with that of SF-36, the Emotional Functioning scale of PedsQL-YA-J with the Mental Health scale of SF-36, the Social Functioning scale of PedsQL-YA-J with that of SF-36, and the Work/School Functioning scale of PedsQL-YA-J with the Role Physical scale of SF-36 and Role Emotional scale of SF-36. We hypothesized that the Physical Functioning scale of PedsQL-YA-J would correlate with the Physical Component Summary score of SF-36 rather than the Mental Component Summary score of SF-36 and that the Emotional Functioning scale of PedsQL-YA-J would correlate with the Mental Component Summary score of SF-36 rather than the Physical Component Summary score of SF-36.
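For illustration, Pearson’s product–moment correlation coefficient and the size labels cited above can be sketched in Python. Treating 0.10, 0.30, and 0.50 as lower bounds of each category, and the label ‘negligible’ for smaller values, are assumptions made for this sketch; the function names are hypothetical.

```python
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson's product-moment correlation coefficient of two equal-length samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_size(r: float) -> str:
    """Label |r| using the 0.10 / 0.30 / 0.50 thresholds (assumed to be lower bounds)."""
    size = abs(r)
    if size >= 0.50:
        return "large"
    if size >= 0.30:
        return "medium"
    if size >= 0.10:
        return "small"
    return "negligible"  # label not from the source; added for completeness
```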

We calculated Welch’s t test and 95 % confidence intervals between groups to describe known-groups validity. We predicted that the Physical Functioning scale score and total score would be low among young adults who had subjective symptoms; that the Emotional Functioning scale score and total score would be low among young adults who scored CES-D ≥16; and that the Physical Functioning scale, Work/School Functioning scale, and total scores would be low among young adults who had illness or injury requiring regular medical visits.
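The group comparison can be sketched in Python as follows, as an illustration rather than the SPSS procedure used here. The sketch returns the mean difference, Welch’s t statistic, and the Welch–Satterthwaite degrees of freedom. The P value and the 95 % confidence interval additionally require a quantile of the t distribution, which the Python standard library does not provide, so they are omitted. The function name is hypothetical.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence, Tuple


def welch_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float, float]:
    """Return (mean difference, Welch's t, Welch-Satterthwaite degrees of freedom)."""
    na, nb = len(a), len(b)
    # Squared standard error contributed by each group (unequal variances allowed).
    va, vb = variance(a) / na, variance(b) / nb
    diff = mean(a) - mean(b)
    t = diff / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return diff, t, df
```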

We conducted an exploratory factor analysis using the principal factor method and promax rotation. The number of factors was determined so that the discriminant criteria had eigenvalues of 1.0. We hypothesized the same five-factor model as for the PedsQL-O [26]: two items (‘Hurt or ache’ and ‘Low energy’) in the Physical Functioning scale would load with Emotional Functioning, thus producing one combined factor; the remaining six items of the Physical Functioning scale would represent one factor; Work/School Functioning items would split into two factors (the first three items of function and the next two items of absence); and Social Functioning would form one factor.

Results

Sample characteristics

We distributed questionnaires to 842 participants, and 459 (54.5 %) were returned. Thirty-one of these questionnaires were excluded for the following reasons: (1) the questionnaire was incomplete, (2) participant was aged <18 or >25 years, (3) participant’s first language was not Japanese, (4) participant’s answers to all 20 items in CES-D were either all ‘0’ or ‘3’ (as the CES-D scale allows positive and negative items, questionnaires returning all ‘0’ and all ‘3’ answers were classified as insincere), or (5) participant’s answers to all 36 items in SF-36 were identical (similar reason to (4)). A total of 428 (50.8 %) questionnaires were analyzed.
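Exclusion criteria (4) and (5) can be expressed as simple checks, sketched here in Python for illustration. The sketch assumes each questionnaire is available as a list of coded responses with CES-D items coded 0–3; the function names are hypothetical.

```python
from typing import Sequence


def cesd_insincere(items: Sequence[int]) -> bool:
    """Criterion (4): all CES-D items answered '0', or all answered '3'."""
    if not items:
        return False
    return all(v == 0 for v in items) or all(v == 3 for v in items)


def sf36_insincere(items: Sequence[int]) -> bool:
    """Criterion (5): all SF-36 items received an identical answer."""
    return len(items) > 0 and len(set(items)) == 1
```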

The median age of the participants was 20.0 years (Table 1), and the sample included 244 university students (57.0 %), 151 vocational school students (35.3 %), and 33 company employees (7.7 %). Most company employees were university graduates (n = 24 of 33, 73 %), and others had completed graduate school (n = 3, 9 %), vocational school (n = 3, 9 %), or senior high school (n = 3, 9 %).

Table 1 Subject characteristics

Scale descriptions

The values for all scales fell in the possible range of 0–100 (Table 2). Nearly half of participants (47.0 %) reported the maximum possible score in the Social Functioning scale, and 34.8 % reported the maximum possible score in the Physical Functioning scale. No floor effect was observed. The means of the total score and the four scale scores were similar across all educational/employment statuses (total score: η² = 0.003, P = 0.569; Physical Functioning: η² = 0.005, P = 0.375; Emotional Functioning: η² = 0.012, P = 0.096; Social Functioning: η² = 0.006, P = 0.251; Work/School Functioning: η² = 0.009, P = 0.252).

Table 2 Score distribution of the Japanese version of the PedsQL Generic Core Scales Young Adult Version

Feasibility

Participants took 1–5 min to complete the PedsQL-YA-J pilot questionnaire. Of all 459 returned questionnaires (including 31 exclusions), an average of 1.2 % of items were incomplete, and missing items were independent of each other (P = 0.411).

Reliability

The scales were internally consistent for all participants and in each group by educational/employment status (Table 3). We distributed retest questionnaires to 115 participants, 74 of whom (64.3 %) returned the retest questionnaires. Fifteen questionnaires were excluded because (1) the participant reported a significant event had affected them since the initial questionnaire, or (2) they returned the retest after the prescribed interval.

Table 3 Reliability of the Japanese version of the PedsQL Generic Core Scales Young Adult Version

A total of 59 (51.3 %) questionnaires were analyzed for retest reliability. The interval between the initial test and the retest was 8–21 days (median = 16 days). A comparison of respondent characteristics for retest and non-retest participants (Fisher’s exact test or Welch’s t test) showed that retest participants tended to be older (mean age = 21.4 years, P = 0.01) and employed (n = 12 [20 %], P = 0.01). The initial scores of the retest sample were similar to those of the non-retest sample, and the total and each scale score at retest were similar to those at the initial test (Table 3), indicating moderate to good agreement between the retest and initial scores.

Validity

The PedsQL-YA-J scores were concurrently valid against the CES-D, with Pearson’s correlation coefficient for each score as follows: Physical Functioning (−0.43), Emotional Functioning (−0.57), Social Functioning (−0.46), Work/School Functioning (−0.50), and total (−0.61) (all P < 0.001).

Our hypothesis that the correlation would be highest between PedsQL-YA-J Physical Functioning scale and the SF-36 Physical Functioning scale, the Emotional Functioning scale with the Mental Health scale, and the Work/School Functioning scale with the Role Physical scale and Role Emotional scale (Table 4) was confirmed. However, the PedsQL-YA-J Social Functioning scale correlated better with SF-36 Role Emotional scale and Mental Health scale than with the SF-36 Social Functioning scale. The Physical Functioning scale correlated better with the SF-36 Physical Component Summary score than with the Mental Component Summary score, and the Emotional Functioning scale correlated better with the Mental Component Summary score than with the Physical Component Summary score.

Table 4 Convergent and discriminant validity of the Japanese version of the PedsQL Generic Core Scales Young Adult Version (Pearson’s correlation coefficients for each scale)

Known-groups validation showed that the Physical Functioning scale was sensitive to subjective symptoms, the Emotional Functioning scale was sensitive to depression, and the Work/School Functioning scale was sensitive to illness or injury requiring regular medical visits (Table 5). The Physical Functioning scale was insensitive to illness or injury requiring regular medical visits, although scores were 1.6 points lower (95 % confidence interval = −1.3 to 4.6) among participants with such illness or injury. The total score was sensitive to subjective symptoms, depression, and illness or injury requiring regular medical visits.

Table 5 Known-groups validity of the Japanese version of the PedsQL Generic Core Scales Young Adult Version

Exploratory factor analysis revealed a structure with six factors, broadly consistent with the hypothesized five-factor structure and supporting factorial validity (Table 6). As hypothesized, two items in the Physical Functioning scale loaded with the Emotional Functioning scale. The remaining items of the Physical Functioning scale were divided into two factors: the first comprising items 1, 5, and 6, and the second comprising items 2, 3, and 4. The cumulative proportion was 57.0 %.

Table 6 Factorial validity of the Japanese version of the PedsQL Generic Core Scales Young Adult Version

Discussion

The results show that the translated PedsQL-YA-J is a feasible, reliable, and valid measure of HRQOL for healthy young adults. None of the four scale scores or the total score differed significantly among the three educational/employment groups (university student, vocational school student, and company employee). All four scale scores and the total score were internally consistent regardless of educational/employment status. These results confirm the broad range of application and psychometric properties of the PedsQL-YA for young adults.

Most participants in this study were students in universities or vocational schools, and the mean age (20.1 years) of the participants was in the lower half of the target age range (18–25 years). When our results were compared with those of the Japanese national survey Comprehensive Survey of Living Conditions [37], the proportion of participants reporting subjective symptoms was 15 % higher than that in the previous survey. However, the proportion reporting illnesses and injuries requiring regular medical visits was approximately equal.

Given the socio-economic status of the participants, most are likely to engage in everyday social life. When the scores of university students in the present study were compared with other data [28], the values from the present study were higher in all scales. The greatest difference was found in the Emotional Functioning scale, which was 15.4 points higher than that in previous studies. These findings indicate that the HRQOL of the participants was very good, possibly explaining why the scores for the Physical Functioning and Social Functioning scales showed substantial ceiling effects. The observed ceiling effects, particularly of the Physical and Social Functioning scales, are consistent with those reported for relatively healthy patients [30]. The Physical Functioning scale can discriminate physically distressed young adults from healthy ones, although a large sample size is required to discriminate physically fit young adults from those of average fitness. Further, Robert et al. [29] reported the ceiling effect of the Social Functioning scale and discussed its limitation to peer-based relationships and the need to broaden it to include romantic relationships for use with adults (aged 25 years or over). Although we concluded that the items on this scale were appropriate for assessment of social functioning continuing longitudinally from childhood, future users should take into account the limited range of relationships for adults.

Our present data showed that the feasibility of the PedsQL-YA-J was high, given that participants were able to complete the questionnaire quickly and the percentage of missing answers was low. The items and format of the PedsQL-YA-J appeared to be easy to understand and answer. The PedsQL-YA-J reliability was also high. Cronbach’s alpha coefficients for total and subscales scores were ≥0.70, and an α coefficient ≥0.90 for the total score confirmed high internal consistency.

We noted no significant differences in scores between participants completing or not completing the retest, indicating that the data from the retest participants could be used to calculate retest reliability. Although the ICC values showed moderate to good agreement, they did not reach the general standard of reliability (≥0.70) [25, 38]. The relatively low ICC of the Social Functioning scale compared with the other scales is consistent with the result of a study using the Persian version of the PedsQL-YA for patients with rheumatoid arthritis in Iran [39]. We consider the markedly low ICCs noted with the Physical and Social Functioning scales to be due to the ceiling effect. The reliability of the PedsQL-YA-J is supported by internal consistency.

Content validity was assessed via cognitive interview with participants after questionnaire completion, discussion between the authors with clinical experience in the pediatric area and research experience in scale development, and confirmation by JWV. We confirmed that the Japanese translations of the questions were clear and appropriate for the young adults in the target age range (18–25 years) and that the concepts in the scales were understandable and could be answered. We concluded that the PedsQL-YA-J and PedsQL-YA-O [28] were equivalent and consistent with the PedsQL-J [27].

The total and four subscale scores for the PedsQL were all correlated with the CES-D results, indicating concurrent validity. Convergent scales in PedsQL and SF-36 were also correlated, thus verifying convergent and discriminant validity of the PedsQL-YA-J compared with the SF-36 scale, except for the Social Functioning scale. The PedsQL-YA-J Social Functioning scale correlated better with the SF-36 Role Emotional and Mental Health subscales than with the SF-36 Social Functioning scale. The PedsQL and SF-36 questionnaires might therefore evaluate different aspects of social health and function. In particular, the PedsQL Social Functioning scale evaluates the difficulty in building a relationship with others by asking questions such as, ‘Do you have difficulty getting on with peers?’ In contrast, the SF-36 Social Functioning subscale evaluates social activity via questions such as, ‘In the past month, how much everyday socializing with your family, friends, and neighbors was disturbed for physical or psychological reasons?’ The PedsQL framework was designed to evaluate a patient’s social development and ability to build relationships with others. In contrast, SF-36 was designed for patients older than 15 years of age [21] and focuses on the nature of adult social activity. These differences may reduce the correlation between the PedsQL and SF-36 Social Functioning scales.

We confirmed known-groups validity: the PedsQL Physical Functioning, Emotional Functioning, and Work/School Functioning scales were sensitive to subjective symptoms, depression tendencies, and regular medical visits, respectively. Meanwhile, we did not observe a significant difference in the Physical Functioning score according to ‘injury or illness requiring regular medical visits’. This may be because we did not request details regarding the severity of injuries, illness, or disease requiring medical visits, and participants might have reported minor injuries or illness which did not affect their physical functioning.

Although the factor analysis identified a six-factor structure (one more than the five factors hypothesized), we do not recommend using the PedsQL-YA-J with different scoring methods from the PedsQL-O, PedsQL-J, and PedsQL-YA-O. Work/School Functioning items split into two different factors, as hypothesized a priori and as reported for the PedsQL-O [26] and the modified PedsQL-YA-O [30]. Both factors (work/school functioning and missing) might therefore act together as indicators of Work/School Functioning from different perspectives.

Consistent with our hypothesis, two items split from the Physical Functioning scale, which was similar to the result of a previous study of the PedsQL-O among healthy children [26, 30]. However, of note, the remaining six items of the Physical Functioning scale acted as two factors, where the questions for first, fifth, and sixth items assessed the ability to walk, bathe, or undertake household chores, compared with those in the second, third, and fourth items which assessed the ability to participate in running, jogging, or sports. The former questions identify common tasks that are less strenuous than the latter, and the separation into two groups probably arises because the participants were healthy young adults without difficulty in normal social life. Given that Ewing et al. [30] reported a different factor structure (six items in one factor) for the modified PedsQL-YA among young adults with cancer or a blood disorder, our present finding may be sample specific. To our knowledge, this is the first report on the structure of factors for the PedsQL-YA in healthy young adults.

Several limitations to the present study warrant mention. First, given that most participants were young and in good health, PedsQL-YA-J data obtained from hospitalized patients with illness or disorders or people without everyday social lives should be interpreted with caution. Participants were also drawn from a limited range of socio-economic groups, and the scale should therefore be tested with other groups before application to other populations. Future studies and analyses are needed to explore the sensitivity and responsiveness of the PedsQL-YA-J and its factor structure among young adults with chronic disease. Furthermore, longitudinal studies to monitor children with health problems as they move into young adulthood are needed.

Nevertheless, the strengths of this study include the assessment of retest reliability, concurrent validity, and factorial validity, which were not assessed for the original version [28]. Although the initial PedsQL-YA-O development was restricted to university students [28], our study tested a broader population sample including vocational school students and company employees.

Conclusion

Here, we report that the PedsQL-YA-J is a feasible, reliable, and valid method for assessing HRQOL among young adults in education, employment, or training, and that it is suitable for clinical trials and epidemiological research. This scale measures HRQOL consistently from childhood to young adulthood and can thus be used for longitudinal assessments and long-term follow-up studies. Longer-term data for HRQOL will help support the survivors of chronic childhood diseases.