Key Points

This paper critically examines the methodological considerations involved in the valuation of child- and adolescent-specific preference-based measures.

The paper concludes that while the choice of whose preferences and which perspective to use is a matter of normative debate, and ultimately for decision by reimbursement agencies and policy makers, greater research around these issues would be informative and enrich these discussions.

Gaps in research evidence are identified, including the anchoring of adolescent preferences for the calculation of quality-adjusted life-years, and the generation and use of combined adult and adolescent preferences.

1 Introduction

Economic evaluation is increasingly used to inform resource allocation decisions in healthcare, often assessing benefits using quality-adjusted life-years (QALYs) or disability-adjusted life-years (DALYs). The methodology for assessing interventions and measuring and valuing health benefits in adult populations for economic evaluation is well developed, including detailed guidance from many international agencies (for example, see the National Institute for Health and Care Excellence [NICE] Guide to the methods of technology appraisal [1] and the Pharmaceutical Benefits Advisory Committee [PBAC] guidelines for preparing submissions to the PBAC [2]), and good practice guidelines, for example International Society for Pharmacoeconomics and Outcomes Research (ISPOR) guidance [3, 4]. However, the methods for assessing interventions for child and adolescent populations often lack detailed guidelines, or implicitly assume that what is recommended for adults is also most appropriate for children and adolescents, despite there being special considerations for children (for example, see Ungar [5]). One important aspect relates to the valuation of health and/or quality of life for children and adolescents for use in health technology assessment (HTA) and economic evaluation, in particular to generate QALYs.

The quality-adjustment weight of the QALY is often generated through application of a preference-based measure accompanied by off-the-shelf utilities. Preference-based measures can be generic or condition-specific, as well as population-specific, including child and adolescent measures and adult measures. A child and adolescent preference-based measure is designed to measure and value the health of children typically aged from around 7–17 years (specific target ages vary between measures and there are cases where they are used from 4 years of age, for example the Child Health Utility 9D [CHU9D] [6]). Adult measures are generally designed to measure and value the health of adults aged 18 years onwards. Some measures are intended for use in children, adolescents and adults (for example, the Health Utilities Index [HUI] 2 and HUI3).
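As a minimal sketch of how an off-the-shelf value set feeds into the QALY, the following Python example scores health states with a simple additive decrement model and sums utility-weighted time. The dimension names, decrement values and patient profile are hypothetical illustrations and do not correspond to any published value set or measure.

```python
# Hypothetical additive value set: utility = 1 - sum of decrements.
# Dimension names and decrement values are illustrative assumptions only.
VALUE_SET = {
    "pain":     {1: 0.00, 2: 0.05, 3: 0.15},
    "mobility": {1: 0.00, 2: 0.04, 3: 0.12},
    "sadness":  {1: 0.00, 2: 0.06, 3: 0.18},
}


def utility(state):
    """Score a health state (dimension -> level) on the 1-0 full health-dead scale."""
    return 1.0 - sum(VALUE_SET[dim][level] for dim, level in state.items())


def qalys(profile):
    """Sum utility x duration (in years) over a sequence of (state, years) pairs."""
    return sum(utility(state) * years for state, years in profile)


full_health = {"pain": 1, "mobility": 1, "sadness": 1}
moderate = {"pain": 2, "mobility": 2, "sadness": 1}

# Two years in the moderate state followed by three years in full health.
profile = [(moderate, 2.0), (full_health, 3.0)]
total_qalys = qalys(profile)
```

The same scoring function can be applied to self-reported or proxy-reported states; only the value set changes when a different population or perspective is used for valuation.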

Child and adolescent measures differ to adult measures in important ways (see Matza et al. [7] for an overview). For example, child and adolescent measures may regularly need to be proxy-reported by carers, as well as self-reported, since younger (aged < 7 years) and intellectually impaired respondents may be unable to self-report their own health. This raises important considerations around the classification systems used to measure health (for further discussion see Ungar [8], Prosser et al. [9], Petrou [10] and De Civita et al. [11]); for example, content that must be appropriate and understandable as the person ages, and suitable for both self-report and proxy report (for example see Pickard and Knight [12] for issues around self- and proxy-reporting).

However, one potential key difference between adult and child- and adolescent-specific preference-based measures relates to their value sets, the scoring to generate utilities for economic evaluation, which is the main focus of this paper. The valuation of any preference-based measure requires methodological decisions: whose preferences, which perspective, elicitation technique and mode of administration. If the elicitation technique does not produce scores on the QALY scale, then methods to anchor onto the 1–0 full health–dead scale required for economic evaluation are needed. Some of these methodological questions differ for child and adolescent measures in comparison with adult measures, and while many can be informed by research, some are normative and ultimately require a value judgement. Some international agencies also have requirements around the methods used to generate value sets for measures used in HTA (for example, see the NICE Guide to the methods of technology appraisal [1]). The issue of comparability with adult utilities, and consistency of technology assessments across conditions and populations, should be considered. It is also important to consider the comparability with adult utilities within a condition and within a cost-effectiveness model as QALYs may include impacts on length of life and quality of life from childhood into adulthood.

The aims of this paper are to (1) identify current available child- and adolescent-specific generic preference-based measures; (2) summarise and provide a critical assessment of the methodological considerations in the valuation of child- and adolescent-specific preference-based measures; (3) review the existing literature on generating value sets for child and adolescent preference-based measures; and (4) identify current gaps in research evidence and methods regarding the valuation of child- and adolescent-specific preference-based measures.

2 Child- and Adolescent-Specific Generic Preference-Based Measures

A recent review [13] of generic multi-attribute preference-based instruments in paediatric populations identified and provided an overview of the following measures, which are summarised in Table 1: Adolescent Health Utility Measure (AHUM) [14]; Assessment of Quality of Life-6 Dimensions (AQoL-6D) [15, 16]; CHU9D [6, 17,18,19,20,21,22,23,24,25]; EuroQol-5 Dimension Youth (EQ-5D-Y) [26,27,28,29]; HUI2 [30, 31]; HUI3 [32]; Quality of Well-Being (QWB) [33]; 16-Dimension (16D) [34]; and 17-Dimension (17D) [35].

Table 1 Summary of the classification systems of child- and adolescent-specific generic preference-based measures

This review focuses on child- and adolescent-specific generic preference-based measures, although there are some examples of child- and adolescent-specific, condition-specific, preference-based measures (for example for dermatitis [36] and asthma [37]), with others also in development.

3 Methodological Considerations in the Valuation of Child- and Adolescent-Specific Measures

This section outlines the main issues and critically assesses the options available to researchers, clinicians and other key stakeholders. Decisions relating to valuation may be influenced by the measure under consideration or by recommendations from reimbursement agencies. However, several important methodological considerations in the valuation of child- and adolescent-specific measures can be informed by economic theory and research, for example by identifying good practice through understanding the strengths and limitations of different approaches when applied in different modes of administration, to different populations, using different perspectives. Table 2 presents an overview of key methodological considerations in this context.

Table 2 Considerations and study characteristics in the valuation of child- and adolescent-specific measures

3.1 Whose Preferences

Utilities that are used to generate the value set for preference-based measures can be elicited from adults (members of the general public, parents, or healthcare professionals), young adults, adolescents and children. The choice of whose preferences to use is important, as research has shown that different populations provide different preferences [22, 38, 39], and arguments can be made for involving both child and adult preferences in medical decision making [40].

3.1.1 Adult Preferences

Adult preferences can be advocated on the grounds that adults ultimately fund healthcare through taxation, and hence their preferences should be used to determine how healthcare resources are allocated. Value sets for preference-based measures for adults are typically generated using adult general population preferences elicited for hypothetical health states, and hence it can be argued that the elicitation of adult preferences for child- and adolescent-specific preference-based measures can provide comparability in the methodology used to elicit preferences for adults, children and adolescents. However, while this provides comparability in the population used to elicit preferences, it does not guarantee comparability in the utilities that are elicited; for example, see Sect. 3.2. The comparability in methods, but not the resulting utilities, can generate issues for HTA where utilities are modelled over time as the patient ages from childhood through adulthood.

In general, it may be argued that adults have a greater understanding of preference elicitation tasks than children and adolescents, which can be cognitively demanding both in terms of understanding the task and being able to make a choice (although this will differ at the individual level) [19]. In addition, it is widely regarded as being ethically acceptable to ask adults to compare health states to being dead, without causing unnecessary distress. However, while adults may have greater cognitive understanding of the tasks, they may not understand the child and adolescent health states and their impact, and this is something discussed further below regarding perspective. In addition, previous research has demonstrated that adult preferences can differ to child and adolescent preferences [22], therefore utilities derived from adult preferences should not be viewed as interchangeable with those derived from children and adolescents.

3.1.2 Child and Adolescent Preferences

Child and adolescent preferences can be argued for on the grounds that it is children and adolescents who experience the health states, and some institutions regard adolescent views as an important consideration for any assessment of health interventions [41,42,43,44]. However, younger children aged around 7–10 years are unlikely to fully understand the tasks and are unlikely to be able to make a choice. The ability to understand and choose is not only impacted by age but may also be impacted by educational ability, experience of ill health, and sociodemographic characteristics, meaning some younger children may be able to undertake these tasks, while some older children may be unable to undertake them [45]. The type of elicitation approach adopted, the number of tasks that are presented, framing of questions, the complexity of wording, the number of dimensions in health states, and health state selection for valuation (and comparisons) may also affect the difficulty of the tasks (for an example of how methodological choices may impact see Stevens [45]). Presentation and design can be tailored to the population asked to value health states, to ease comprehension and reduce difficulty; for example, colour coding to highlight differences/similarities, boldening/graying of severity levels, and allowing dimensions to vary for only a subset of dimensions within or between tasks (for an example of these types of approaches in an adult population, see Norman et al. [46]). Research has found internally valid responses for adolescents valuing hypothetical health states using best–worst scaling and discrete choice experiment (DCE), suggesting that an appropriate selection of task, design, framing and presentation can be used to elicit adolescent preferences where respondents have good understanding and make reasoned choices [22, 47, 48]. 
It should be noted that when applying the best–worst scaling approach in the valuation of CHU9D states, worst choices were far less consistent than best choices [22]. This tendency was also evident in the valuation of CHU9D health states using an adult sample, but was found to be more prevalent in adolescents. However, such a phenomenon was not observed in the valuation of EQ-5D-Y health states in different samples of adolescents and adults in two countries [48]. Other research examining the elicitation of preferences for hypothetical health states has found that children aged 10–17 years can complete best–worst scaling tasks, and children aged 14–17 years can undertake pairwise comparison tasks [45].

Questions have been raised around the acceptability and appropriateness of asking preference elicitation tasks that involve consideration of the state of being dead with adolescents. This raises two issues. First, whether adolescents are able to understand and make reasoned choices in questions involving consideration of being dead; and, second, whether the use of elicitation techniques involving consideration of being dead would cause distress or upset for adolescents and therefore cause concerns for research ethics committees. Some studies have been undertaken involving consideration of being dead with adolescents [49], suggesting that if appropriate design and framing is used, these tasks may be appropriate, and further guidance for ethics committees is required for this to be an option pursued in the future as currently there is little guidance on these issues.

The inability of younger children to value health states raises the issue of whether it is more acceptable for adolescents than adults to value health states experienced by young children. Either argument can be made around whose preferences should be used to value health states for young children, but for these children, their own preferences cannot be taken into account, meaning that it is a normative decision around whose preferences to use.

3.1.3 Hypothetical Preferences, Experience-Based Preferences or Patient Preferences

Preferences can be elicited for hypothetical health states, where people imagine health states, termed hypothetical general population preferences, and these could be provided by general population adults or adolescents. However, it is possible to ask adolescent patients in ill health to value hypothetical health states, which is referred to as patient preferences. Another alternative is to ask adolescent patients in ill health to value their own health state, which generates experience-based preferences. An experience-based value set has been estimated for the EQ-5D-Y in Canada, which estimates a regression with own visual analogue scale (VAS) as the dependent variable and the EQ-5D-Y classification system as the independent variables from respondents aged primarily 10–11 years [50], although note that this uses a 1–0 scale, where 1 equals best state and 0 equals worst state. There are theoretical and practical arguments around the advantages and limitations of both experience-based [51] and patient preferences [52] that have been discussed for adult utilities, and many of these arguments are likely to apply for child- and adolescent-specific preference-based measures.
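The experience-based approach described above, regressing respondents' own VAS score on the levels of the classification system, can be sketched as follows. This is an illustrative outline only: the two dimensions, the coefficients and the noise-free synthetic data are assumptions made for the example, and the model specification of the Canadian study [50] is not reproduced here.

```python
from itertools import product

DIMS = ["pain", "sadness"]  # hypothetical dimensions, three levels each
LEVELS = [1, 2, 3]


def design_row(state):
    """Intercept plus one dummy per dimension for each level above level 1."""
    row = [1.0]
    for dim in DIMS:
        for level in LEVELS[1:]:
            row.append(1.0 if state[dim] == level else 0.0)
    return row


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(xs, ys):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(xs[0])
    xtx = [[sum(r[i] * r[j] for r in xs) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(xs, ys)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic data: each "respondent" reports an own state and an own VAS score
# rescaled to 1 (best state) - 0 (worst state). Coefficients are made up.
TRUE_COEFS = [0.95, -0.10, -0.30, -0.05, -0.25]
states = [dict(zip(DIMS, levels)) for levels in product(LEVELS, repeat=len(DIMS))]
xs = [design_row(s) for s in states]
ys = [sum(b * x for b, x in zip(TRUE_COEFS, row)) for row in xs]

# Order of estimates: intercept, pain2, pain3, sadness2, sadness3.
coefs = ols(xs, ys)
```

Because the dependent variable is a VAS score with best and worst states as end points, the estimated decrements are not anchored onto the full health–dead scale without a further step.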

3.1.4 Combined Preferences

One option is to extend the definition of the general population to include adolescents when valuing health states, to generate a value set that combines adult and adolescent preferences. Since adolescent and adult preferences may differ, sampling strategies around age and sex would need to be carefully considered to achieve an appropriate sample. Alternatively, both adolescent and adult value sets could be generated, and both used to inform analyses (for an analogous argument for general population and patient preferences see Brouwer and Versteegh [52]; this is also relevant for the Second Washington Panel on Cost Effectiveness [53]); however, careful consideration of the appropriate elicitation technique and perspective would be required.

3.1.5 Informed Preferences

Informed preferences have been used in the elicitation of adult utilities as a way of obtaining preferences from the general population that are more informed about what it is like to live in ill health, using information from patients experiencing health states [54]. This technique could be used to provide adolescents with more information about what it is like to experience ill health, since their experiences of ill health may be limited, or could be used to provide adults with more information about how ill health impacts on children and adolescents when they are valuing health states in the context of imagining what it is like for a child (see Sect. 3.2 below). For example, information that is provided could involve child and adolescent experience-based preferences or child and adolescent patient preferences (see Sect. 3.1.3). To our knowledge, this has not been undertaken in the literature, and further research may be worthwhile.

3.2 Perspective

In hypothetical health state valuation tasks, participants are asked to imagine someone in a health state and to indicate how good or bad the health state is for that person. The term ‘perspective’ is used to indicate who the person is that they are imagining is experiencing the health state; for example, the person could be themselves, a child, or another adult. The elicitation of preferences from adolescents would usually involve valuation from their own perspective, where they are imagining that they are experiencing the health state; however, adolescents could be asked to value health states experienced by someone else (an ‘other’ perspective), but this is likely to be more cognitively challenging.

The elicitation of preferences from adults can involve multiple different perspectives, i.e. own health as an adult; health state for themselves as a child; health state in the context of a child at a specified age; and health state for another adult.

‘Own perspective’ for adults can be argued for on the basis that the adult is under a ‘veil of ignorance’ where they do not know who is experiencing the health state, and hence the value they provide is not influenced by any views around children or child health. It can be argued that this provides comparability with the methodology used to elicit hypothetical adult preferences for adult health states. In addition, if child health is valued more highly by society than adult health, this can be taken into account in the resource allocation process, using, for example, QALY weighting or deliberation, where there is no risk of double counting as the utilities are not in any way influenced by participants’ preferences around child health. However, the classification system of child- and adolescent-specific preference-based measures may involve terms that are inappropriate for adults, for example CHU9D mentions homework and schoolwork in one dimension (although there is an adult version that instead refers to work [22]). If these were to remain in their original wording, this would likely cause confusion and a lack of engagement, and would lead participants to the view that they are being asked to imagine themselves as a child. Alternatively, some dimensions can be reworded, meaning that the definition of this dimension is not analogous to the aspect of health-related quality of life that the child or adolescent is reporting using the classification system, creating a discrepancy between what is valued in the value set and what is reported using the measure [24, 55]. Another example is daily routine, where although the dimension would not be reworded in a valuation task, a child’s daily routine will differ from the daily routine adults imagine for themselves.

Adults could be asked to imagine the health state in the context of a child of a specified age, where often a 10-year-old child is specified, although this could be any age. However, the child that the participant imagines may matter; for example, whether it is their own child, grandchild, or child they have strong feelings about, or a child they do not know. These preferences may be influenced by participants’ views about children and child health, meaning that the elicited preferences may include not only how good or bad the health state is but also how good or bad it is that the child they are imagining is in this state of ill health. It can be argued that the use of these preferences to inform policy, for example to generate QALYs for HTA, should take this into consideration since any QALY weighting or deliberation that gives a higher weight to child health relative to adult health may be double counting. There is also the issue around the age of the child that adults should be asked to imagine. There is a possibility that the age of the child participants are being asked to imagine impacts on preferences; this is an area currently under research.

Adults could be asked to imagine the health states for themselves as a child, but this is prone to recall bias as they will not be able to accurately recall what it was like to be a child. Their preferences may also be influenced by views around child health, their childhood, and their experiences as a parent/guardian if they have children.

3.3 Elicitation Technique and Mode of Administration

Table 2 outlines the different preference elicitation techniques that can be used in studies eliciting valuations from adolescents and adult populations: best–worst scaling; DCE; ranking; rating scale/VAS; DCE with duration; time trade-off; and standard gamble. Each of these elicitation techniques is theoretically plausible for use with adolescents and adults, although there may be ethical and practical concerns around the acceptability and appropriateness of the use of some of these techniques in adolescents.

Best–worst scaling, ranking and DCE are all ordinal techniques that provide relative weightings of dimensions and severity levels, and are all generally considered as being easy to understand. These methods do not require any consideration of being dead, and are therefore considered ethically acceptable and appropriate for use in adolescents. However, all these methods only generate anchored preferences onto the 1–0 full health–dead scale if there is mention of being dead and the duration of health states. For example, in DCE with duration, this is achieved by including duration as an additional attribute [56, 57] (see Sect. 3.4). VAS tasks do not require the inclusion of dead as a state in the task, but if dead is included, the generated preferences can be directly anchored onto the 1–0 full health–dead scale.

Best–worst scaling has been criticised in the literature when used to value health states in adults, and a small number of studies have found that the preferences it generates differ from those generated by other elicitation techniques [58, 59], although further research studies examining this are recommended. DCE may be cognitively challenging, particularly where there are several dimensions of health and where these vary across the profiles within a choice set. Ranking over a large number of health states can become laborious and time-consuming, with a large amount of reading and recall of the other states each state is being ranked alongside. VAS has been criticised in the literature as it does not involve sacrifice or opportunity cost, meaning that it may not accurately reflect the value of a health state, although there is no consensus on this issue [60]. In VAS tasks, participants have been found to spread the set of states (or dimensions) they are valuing across the scale, meaning that the value of a state can be affected by the states it is valued alongside, to avoid the ends of the scale, and to display a tendency to prefer numbers ending in 5 or 0 (50, 55, 60) [61]. Digit preferences can also be observed using other cardinal elicitation techniques, and in VAS valuation studies the impact of these effects may be reduced through careful design.

Time trade-off, standard gamble and DCE with duration are cardinal techniques that generate utilities on the 1–0 full health–dead scale. These techniques involve imagining being dead, and, as discussed above, questions have been raised around the acceptability and appropriateness of asking adolescents to complete these tasks. An option to remove the consideration of dead is chained time trade-off or chained standard gamble, where an impaired health state is valued relative to a worse health state, with no mention of dead. The utility for the impaired health state is then anchored onto the 1–0 full health–dead scale using the utility for the worse health state, which is elicited using standard time trade-off or standard gamble; these utilities could be elicited from adults (see Sect. 3.4 for a discussion of some of the issues this raises). To our knowledge, DCE with duration has not been undertaken with adolescents and may be too cognitively challenging since it involves both trading between length of life and health and simultaneously considering multiple profiles of health. DCE with duration will not generate appropriate responses if respondents do not trade between length of life and health, and hence this should be established prior to the use of this technique. Standard gamble involves consideration of risk, and adolescents may have different attitudes to risk than adults, which could impact on elicited standard gamble preferences. Time trade-off is often used to generate value sets for adult preference-based measures, and the use of this technique may provide greater comparability of methods used to generate adult value sets for these measures, provided this can be used appropriately given the methodological choices of whose values and which perspective to use in the valuation survey.
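The chaining step described above can be illustrated with a short worked calculation. The sketch below assumes the conventional time trade-off formula for states better than dead (utility equals years in full health divided by years in the state) and the usual linear chaining formula, in which the impaired state is first valued on a scale where the worse state is 0 and full health is 1. The function names and all numbers are hypothetical.

```python
def tto_utility(years_full_health, years_in_state):
    """Conventional TTO for a state better than dead: indifference between
    x years in full health and t years in the state gives u = x / t."""
    if years_in_state <= 0 or not 0 <= years_full_health <= years_in_state:
        raise ValueError("expected 0 <= years_full_health <= years_in_state, t > 0")
    return years_full_health / years_in_state


def chain(relative_value, anchor_utility):
    """Anchor a chained valuation onto the 1-0 full health-dead scale.

    relative_value: value of the impaired state on a scale where
        1 = full health and 0 = the worse (anchor) health state.
    anchor_utility: utility of the worse state on the full health-dead scale.
    """
    if not 0.0 <= relative_value <= 1.0:
        raise ValueError("relative_value must lie between 0 and 1")
    return anchor_utility + relative_value * (1.0 - anchor_utility)


# Hypothetical example: adults value the worse state with standard TTO
# (indifferent between 3 years in full health and 10 years in the state).
worse_state_utility = tto_utility(3, 10)

# Adolescents value the impaired state relative to the worse state,
# with no mention of dead, giving a relative value of 0.6.
impaired_utility = chain(0.6, worse_state_utility)
```

In this example the anchor comes from one population and the relative value from another, which is the mixing of preferences that Sect. 3.4 discusses.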

The choice of perspective combined with technique should be carefully considered since this can impact on preferences. Research using VAS has shown that adults valuing health states from the perspective of a child of a specified age can generate lower utilities than adults valuing health states for themselves [62]. However, the reverse has been found using time trade-off, where participants trade between health and length of life to indicate their preferences for health states: utilities elicited using an adult’s own health perspective can be lower than utilities elicited when considering the perspective of a child [55], i.e. adults were less willing to trade off length of life for children. This may also potentially occur for DCE with duration and standard gamble due to the risk of death, and may potentially occur because participants are more unwilling to state that a child should die sooner than to state that they themselves should die sooner.

Valuation studies for adult preference-based measures have been conducted using online surveys, computer-assisted personal interviews (CAPI), face-to-face interviews and hall tests across a range of different elicitation techniques. Table 2 highlights the use of classroom tests for adolescents. Appropriate design, framing and presentation can make a difference not only around the appropriateness of the task but also around the appropriateness of the mode of administration used to elicit preferences, and careful piloting is recommended.

3.4 Anchoring

Best–worst scaling, ranking and DCE do not automatically provide utilities that are anchored onto the 1–0 full health–dead scale (see Sect. 3.3 regarding the protocols that enable these methods to directly generate utilities on the 1–0 scale). This presents the key challenge of how to anchor these utilities onto the 1–0 full health–dead scale. Anchoring requires the use of utilities for the classification system that are anchored onto the 1–0 full health–dead scale, and these could be elicited using time trade-off, standard gamble or DCE with duration.

Possible methods for anchoring include mapping the ordinal preferences via regression analysis to cardinal utilities; rescaling using cardinal utilities for worst state/small numbers of states; and a hybrid model simultaneously modelling both ordinal and cardinal data [61] (to our knowledge, the hybrid model has not yet been applied to the valuation of child health states). Both the mapping method and hybrid model have been found to be more accurate at predicting time trade-off utilities when mapped from DCE preferences than the rescaling method [63]. The mapping method approach will simply anchor the ordinal preferences, whereas the hybrid model will simultaneously consider both the ordinal and cardinal data, and hence will produce utilities that combine the data. The selection of which method to apply may therefore depend upon whether the researcher or policy maker aims to generate combined preferences. For example, in the case of the elicitation of adolescent preferences, the mapping approach may be selected if adult preferences are obtained solely for the purpose of anchoring, rather than to generate combined value sets. The anchoring of utilities for child and adolescent preference-based measures in particular is an important area that has been underresearched and has not been fully debated to date.
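Two of the anchoring methods listed above, rescaling and mapping, can be sketched in a few lines. The example below is a simplified illustration under stated assumptions: latent values are taken to come from an ordinal task on an arbitrary scale, rescaling uses a single cardinal utility for the worst state, and mapping uses a simple linear regression of cardinal utilities on latent values. The state labels, latent values and utilities are hypothetical, and published studies may use other functional forms.

```python
def rescale(latent, full_health, worst, worst_utility):
    """Rescale latent values so that full health = 1 and the worst state
    takes its separately elicited cardinal utility."""
    span = latent[full_health] - latent[worst]
    return {
        state: 1.0 - (1.0 - worst_utility) * (latent[full_health] - value) / span
        for state, value in latent.items()
    }


def fit_line(xs, ys):
    """Least-squares intercept and slope for y = a + b x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def map_to_cardinal(latent, cardinal):
    """Regress cardinal utilities on latent values for the states valued with
    both methods, then predict utilities for every state in `latent`."""
    common = sorted(cardinal)
    intercept, slope = fit_line(
        [latent[s] for s in common], [cardinal[s] for s in common]
    )
    return {state: intercept + slope * value for state, value in latent.items()}


# Hypothetical latent-scale values (e.g. from a DCE or best-worst scaling model).
latent = {"11111": 0.0, "22222": -1.0, "33333": -2.5}

# Rescaling: the worst state has a hypothetical cardinal utility of -0.1.
rescaled = rescale(latent, "11111", "33333", worst_utility=-0.1)

# Mapping: hypothetical cardinal utilities for a small number of states.
mapped = map_to_cardinal(latent, {"11111": 1.0, "22222": 0.6, "33333": -0.1})
```

Rescaling fixes the two end points exactly, whereas the unconstrained regression in this sketch does not force full health to equal 1, so an applied mapping model would need to decide how to handle that constraint.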

4 Review of Methods Used to Generate Value Sets for Child- and Adolescent-Specific Generic Preference-Based Measures

Table 3 provides a summary of the value set methodologies of child- and adolescent-specific generic preference-based measures. Note that the AHUM, CHU9D, EQ-5D-Y, 16D and 17D are the only measures intended for use in children and/or adolescents; all of the other measures are also appropriate (and derived) for use in adults. For a more detailed overview of each valuation study of each measure see Chen and Ratcliffe [13].

Table 3 Summary of the value set methodologies of child- and adolescent-specific generic preference-based measures

There is no consensus in the valuation methodology used across the measures, across countries for the same measure (CHU9D, HUI2 and EQ-5D-Y), or across a suite of measures (16D and 17D).

4.1 Whose Preferences

Adolescent preferences are solely used to generate value sets for AQoL-6D and 16D; adolescent preferences anchored using young adult preferences are used to generate CHU9D value sets in Australia and China; adult general population preferences are used to generate value sets for AHUM and CHU9D in The Netherlands and UK, EQ-5D-Y in the US, HUI2 in the UK, and the HUI3 and QWB; parent preferences are used to generate value sets for HUI2 in Canada and 17D.

4.1.1 Samples

Sample size ranges from 115 for the AQoL-6D to 4155 for the EQ-5D-Y. Some differences in sample size would be expected due to differences in the elicitation technique and mode of administration, as well as the choice of modelling and selection of health states for valuation. However, three samples were below 200 (HUI2 valued in Canada and the UK, and the 17D). The approach used to ensure that the sample is representative of the population varies across studies. The 16D and 17D studies recruiting children and adolescents enrolled both school children and patients, while the CHU9D in China recruited only school children to form the adolescent sample and the CHU9D in Australia recruited a community-based sample via parents. Most of the studies involving the adult general population aimed to obtain national representativeness, with the notable exceptions that the AHUM recruited participants both by word of mouth and by an existing panel of potential participants; the sampling method was not specified for the AQoL-6D valuation. Three of the studies were published in 1996 (HUI2 Canada, 16D, 17D), one study was published in 2002 (HUI3), one study was published in 2005 (HUI2 UK), one study was published in 2008 (QWB), and the remainder were published from 2010 onwards. However, many of the valuation studies may have been conducted many years prior to publication, for example the HUI3 valuation was undertaken in 1994.

4.2 Perspective

Adolescent preferences are elicited using their own perspective. Valuation studies where parent preferences are elicited use the perspective of a 10-year-old child for the HUI2 in Canada, and a child aged 8–11 years for the 17D. Valuation studies where adult general population preferences are elicited use their own perspective for the AHUM and CHU9D in the UK and The Netherlands, and the HUI3, and use the perspective of a 10-year-old child for the HUI2 in the UK and the EQ-5D-Y in the US.

4.3 Elicitation Technique and Mode of Administration

There is considerable variation in the preference elicitation tasks used: the AHUM and AQoL-6D use time trade-off; the CHU9D uses different techniques in different countries (best–worst scaling and time trade-off, DCE with duration, and standard gamble); the HUI2 and HUI3 use standard gamble and VAS; the EQ-5D-Y uses DCE with duration; and the QWB, 16D and 17D use a VAS. Adolescent preferences are elicited in classroom settings and online surveys, while adult preferences are elicited using face-to-face interviews and online surveys.

4.4 Anchoring

Most studies elicit values directly on the 1–0 full health–dead scale using conventional valuation approaches, except for the CHU9D in Australia and China. Both HUI2 value sets and the HUI3 value set apply multi-attribute utility theory to combine standard gamble and VAS data.
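
To make the idea of anchoring concrete, the following minimal sketch linearly rescales values estimated on a latent scale onto the 1–0 full health–dead scale, using an externally elicited value for the worst state as the anchor. It is an illustration under stated assumptions and not the procedure of any particular study reviewed here: the function name, the state labels, the latent values and the anchor value of 0.2 are all hypothetical.

```python
def anchor_latent_values(latent, latent_full, latent_worst, anchor_worst):
    """Linearly rescale latent-scale values onto the 1-0 full health-dead scale.

    Full health is mapped to 1 and the worst state is mapped to
    `anchor_worst`, a value assumed to come from a separate elicitation
    task (hypothetical in this sketch).
    """
    if latent_full == latent_worst:
        raise ValueError("full-health and worst-state latent values must differ")
    span = latent_full - latent_worst
    return {
        state: anchor_worst + (1.0 - anchor_worst) * (value - latent_worst) / span
        for state, value in latent.items()
    }


# Hypothetical latent values for four illustrative health states.
latent_values = {"full": 2.0, "mild": 1.0, "moderate": 0.0, "worst": -2.0}

# Hypothetical anchor: the worst state is assumed to be valued at 0.2.
anchored = anchor_latent_values(
    latent_values, latent_full=2.0, latent_worst=-2.0, anchor_worst=0.2
)

for state, utility in anchored.items():
    print(f"{state}: {utility:.2f}")
```

A linear rescaling preserves the ordering and relative spacing of the latent values, so the anchored utilities depend on whose preferences supply the anchor value; other anchoring approaches are possible and may give different results.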

5 Discussion

This paper has critically examined the methodological considerations involved in the valuation of child- and adolescent-specific measures, with reference to the methodological choices made to date in the valuation of child- and adolescent-specific generic preference-based measures. The approaches used to value existing child- and adolescent-specific generic preference-based measures are varied, with no commonality across the measures or, for some measures, across the choices made to value the same measure in different countries. The sample size for some studies was small (the HUI2 in Canada [30] and UK [31], and the 16D [34]) given the size of the classification systems and the intended use of the valuation studies to generate value sets that inform policy. Some of the value sets were published over 20 years ago [30, 34, 35] (the valuation studies underpinning these are likely to have been undertaken years earlier), and preferences may have changed over this time. Furthermore, there have been methodological advances in the health valuation literature. The methodological choices made to generate existing value sets indicate both what has been done and what is possible, yet there are many possibilities for future research around both what else could be done and the scope for recommendations around good practice. While many of the considerations are normative, meaning it is perfectly acceptable and expected that a range of approaches is used to generate existing value sets, both economic theory and empirical research can be used to generate good practice guidelines and maximise the quality of research in this area.

There is currently limited guidance from international agencies around how to generate QALYs, and hence utilities, for use in HTA of interventions affecting young populations. For example, while the NICE Methods Guide is prescriptive about the methods that should be used to generate utilities for adults, limited guidance is given around how to generate, source and model utilities for child- and adolescent-specific states. Recent reviews have found that child- and adolescent-specific preference-based measures have been used only a handful of times in HTAs submitted to NICE covering children and adolescents [64], as well as in published cost-utility analyses for child and adolescent populations [65], and that a large range of diverse methods is used to generate published utilities for children and adolescents [66–68].

The limited use of child- and adolescent-specific preference-based measures to reflect the health and quality of life of children in HTA is concerning, since we are not aware of an evidence base demonstrating that adult preference-based measures (such as the EQ-5D-3L) appropriately and accurately capture the health and quality of life of children and adolescents. Evidence is required to examine whether adult measures self-completed by adults for their own health are a suitable proxy for capturing the health of a child with the same condition, since this type of evidence has been used to inform HTAs [64]. In addition, head-to-head comparisons of adult preference-based measures and child- and adolescent-specific preference-based measures would enable greater understanding of the impact of using an adult or a child- and adolescent-specific measure to measure the health of a child or adolescent.

The issue of comparability and consistency between utilities generated by child- and adolescent-specific preference-based measures and utilities generated by adult measures is important, since HTA utilities are modelled over time as the patient ages from childhood through adulthood. While it can be argued that comparable valuation methodology for different preference-based measures can ensure consistency when considering evidence generated using different measures (see, for example, Brazier et al. [69] for this argument around condition-specific and generic preference-based measures), this does not ensure comparability in the actual utilities that are used. This matters if utility changes as the patient ages because of a change in preference-based measure, or a change from proxy to self-reporting, despite no change in health.
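
This concern can be illustrated with a small numerical sketch. It assumes a hypothetical patient whose health does not change, but whose utility is sourced from a child-specific measure before age 18 and from an adult measure afterwards, and it accumulates undiscounted QALYs over one-year cycles. The function names, the switch age and both utility values are invented for illustration only and are not taken from any value set.

```python
def utility_at_age(age, child_utility, adult_utility, switch_age=18):
    """Return the utility applied at a given age.

    Before `switch_age` the child-measure utility is used; from
    `switch_age` onwards the adult-measure utility is used.
    """
    return child_utility if age < switch_age else adult_utility


def total_qalys(start_age, end_age, child_utility, adult_utility, switch_age=18):
    """Sum undiscounted QALYs over one-year cycles for ages in [start_age, end_age)."""
    return sum(
        utility_at_age(age, child_utility, adult_utility, switch_age)
        for age in range(start_age, end_age)
    )


# Same underlying health state, valued at 0.80 by a hypothetical child
# measure and at 0.70 by a hypothetical adult measure.
mixed = total_qalys(10, 30, child_utility=0.80, adult_utility=0.70)

# Reference case: one measure giving 0.80 at every age.
consistent = total_qalys(10, 30, child_utility=0.80, adult_utility=0.80)

# Step change in modelled utility at the switch, with no change in health.
jump = utility_at_age(18, 0.80, 0.70) - utility_at_age(17, 0.80, 0.70)

print(f"QALYs with a switch of measure: {mixed:.1f}")
print(f"QALYs with a single measure:    {consistent:.1f}")
print(f"Utility step at age 18:         {jump:+.2f}")
```

In this sketch the difference in total QALYs arises entirely from the change of measure, which is the kind of artefact that comparable valuation methodology alone would not remove. Discounting is omitted for simplicity.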

The use of measures, such as the HUI2 and HUI3, that are appropriate for use across children, adolescents and adults has the advantage of consistency and comparability of utilities across all ages of patients. The combination of utilities generated using the EQ-5D-3L and EQ-5D-Y can also arguably provide some consistency in terms of the domains of health assessed, if it is appropriate to assume that domain content is the correct criterion for consistency. The CHU9D measure does have an adult version, but use of this measure in adults can be questioned since the content of the classification system was developed with children aged 7–11 years [6, 17, 18].

It is unclear why child- and adolescent-specific preference-based measures have not been used to a larger extent to generate utilities for child- and adolescent-specific states. Potentially this could be for many reasons that are not mutually exclusive, including a concern around the psychometric performance of these measures; limited uptake of child and adolescent preference-based measures in trials or other studies used to generate data for use in HTA; concern around the appropriateness of existing value set utilities, methodology or, in the case of the EQ-5D-Y, lack of a value set; concern around the scope and focus of these measures and whether they capture all important outcomes for health and social care; or a concern around the use of these measures alongside adult utilities generated using an adult generic preference-based measure and how to combine these utilities. Another potential reason may be that less emphasis is placed on cost effectiveness when making resource allocation decisions for children and adolescents. In addition, the absence of recommendations for the use of child- and adolescent-specific measures in guidelines by international agencies is likely to be an important factor contributing to their limited use; developing these recommendations would encourage greater use of these measures and would be an important step forward.

In future, there are likely to be more child- and adolescent-specific generic preference-based measures, as existing child- and adolescent-specific generic measures are currently undergoing valuation in order to make them preference-based, including the PedsQL [70] (note there is also an adult version), and other measures are amenable to valuation and may be valued in the future, for example PROMIS [71]. At the time of preparation of this manuscript, the EuroQol Group is developing an international valuation protocol for the development of country-specific EQ-5D-Y value sets. This protocol has been informed by completed or in-progress studies funded by the EuroQol Group, which have investigated (1) whether current EQ-5D-3L value sets can be appropriately used with EQ-5D-Y health states [55, 62]; (2) the development of a latent scale value set in the UK using adult and adolescent samples [39, 47]; (3) the evaluation of different anchoring alternatives to latent scale value sets from DCEs [72]; and (4) the impact of using different perspectives when completing DCE with duration tasks to estimate an EQ-5D-Y value set.

The issue of measuring and valuing benefits for children and adolescents cannot be considered in isolation, since the impact of ill-health reaches wider than the child or adolescent, to other family members. There is important literature around the use of a family perspective in economic evaluation for children and adolescents to include spillover effects, and also around joint utility estimation [5, 8, 9, 73–76], and this is an area that deserves consideration by international agencies when they consider whether to make special recommendations around measuring and valuing health benefits in child and adolescent populations for economic evaluation.

The topic of this paper can be discussed in relation to welfarism and extra-welfarism. Welfarism has a clear theoretical position on whose preferences count in social choices, although, as far as we are aware, the literature does not have special considerations for children or adolescents. However, QALYs and cost-effectiveness analyses are grounded in extra-welfarism, and extra-welfarism offers no such guidance. This means that the normative issues that we discuss in the paper require quite strong value judgements.

This review has examined the methodology around the valuation of measures aimed at measuring and valuing the health and quality of life of children and adolescents aged 5 years and above. There are added complications in generating utilities for children below 4 years of age, where none of the generic preference-based measures are recommended for use, meaning that there is little scope for the measurement and valuation of health and quality of life for children of this age as reported by carers/parents. There is a quality-of-life measure for infants and toddlers, the Infant and Toddler Quality of Life Questionnaire (ITQOL) [77–79], but it is not preference-based. Valuation of health and quality of life for this age group would also present new challenges, since what is within a normal developmental range varies widely within the 0–4 years age range, and any generated utilities may need to capture impairment in comparison with the normal developmental range, rather than the normal developmental stage. For example, a newborn baby will not be able to walk or talk, but arguably should not have a utility decrement reflecting their inability to walk or talk, whereas a 4-year-old within the normal developmental range would walk and talk and any impairment would likely be associated with a utility decrement. Therefore, while QALYs can be used to capture health benefits for children aged below 4 years, the estimation of utilities to generate QALYs is far from straightforward.

6 Conclusions

This paper has summarised and critically assessed the methodological considerations involved in the valuation of child- and adolescent-specific measures, and reviewed the methodological choices made to generate value sets for child and adolescent generic preference-based measures. This paper has also identified gaps in research evidence and methods regarding the valuation of child and adolescent health states, in particular around the following.

  • Whose preferences: The collection of experience-based utilities; the elicitation of patient preferences; possibilities for the combination of utilities elicited from adults and adolescents; whether there is a role for, and how to elicit, informed preferences where child and adolescent experience can be used to inform elicitation tasks undertaken by adolescents or adults.

  • Perspective: Whether the age and description of the child affect the preferences elicited from adults valuing from the perspective of the child.

  • Elicitation technique: Greater guidance around when consideration of being dead is both appropriate and acceptable for inclusion in tasks completed by adolescents, and how to ensure tasks are designed and framed appropriately for adolescents.

  • Anchoring: Greater exploration of the anchoring of adolescent preferences using techniques applied in the valuation of adult preference-based measures.

The valuation of child- and adolescent-specific preference-based measures is a challenging area of research that warrants further empirical evidence to inform best practice guidelines. Many international agencies will have a view on this, as will other stakeholders, including the general public, carers/parents and patients; their views, as well as economic theory, will ultimately determine both the research agenda and which methodology is selected.