Introduction

Theoretical and empirical literature in the field of mental health supports the contention that absence of illness does not translate to mental wellness. Mental health measurements, however, have traditionally focused on lower levels of illness symptoms as representative of mental wellbeing. Interventions grounded in the fields of mental health promotion, positive psychology, and school-based mental health, among others, necessitate the use of instruments suited to measure improvements in mental wellbeing without ‘ceiling effects’. Ceiling effects occur when the design of an instrument limits the range of scores in a positive direction and thus the opportunity to capture the full range of the phenomenon being measured (Wells et al. 2003). As the field of practice has moved to accommodate a focus on mental wellbeing, scales are needed to capture improvements in mental wellbeing where variance in average to good mental health can be accurately measured (Stewart-Brown 2002). These measures enable practitioners and researchers to effectively evaluate the impact of interventions designed to improve mental health at a population level (Bryant et al. 2015) and provide a more relevant direction for future program development (Keyes 2005a). Measuring mental wellbeing among adolescents requires instruments that are age-appropriate and acceptable for use in that population. A systematic search and appraisal of instruments is an essential part of this process.

For more than 60 years, scholars have attempted to define mental health in a manner that distinguishes it conceptually from the concept of mental illness (e.g., Vega and Rumbaut 1991) with consideration for variations in meaning based on culture or ethnicity (MacDonald 2006; WHO 2004). Historically, mental health was viewed as a proxy for the absence of mental illness (Smith 1959). Later research supposed that mental health incorporates “the absence of dysfunction in psychological, emotional, behavioral and social spheres… [and] optimal functioning or well-being in psychological and social domains” (Kazdin 1993, p. 128). Concomitantly, there has been a shift in policy arenas towards a definition of mental health that encompasses positive functioning. The U.S. Department of Health and Human Services (USDHHS) defined mental health as a “state of successful mental functioning, resulting in productive activities, fulfilling relationships, and the ability to adapt to change and cope with adversity” (2013, para 2). Similarly, the World Health Organization (WHO) stated mental health is “a state of well-being in which the individual realizes his or her own abilities, can cope with the normal stresses of life, can work productively and fruitfully, and is able to make a contribution to his or her own community” (2016, para 2). Both policy definitions underscore the more positive conceptualizations of mental health that move beyond solely the absence of illness, to incorporate aspects of effective functioning (WHO 2004). However, both of these exclude the feeling aspect of mental wellbeing. 
This was also the case with early models (Jahoda 1958), which focused mainly on functioning and included: (1) positive attitude towards self; (2) sense of personal growth and development; (3) integration of positive psychological functions such that one has the ability to deal with and resolve issues; (4) autonomy leading to a level of independence from social influences; (5) realistic perception of self and outside world; and (6) adaptation to or mastery of environment. Later, Keyes (2002, 2005b) proposed a complete state model of mental health that encompasses both symptoms of positive feelings and positive functioning and emphasizes the social aspect of wellbeing. Based on this model, and using a continuous assessment of mental health, individuals can be classified on a continuum from completely mentally healthy (i.e., presence of flourishing and absence of mental illness) to pure languishing (absence of mental illness) to complete mental illness (i.e., presence of mental illness and absence of flourishing) (Keyes 2005b). Among adolescents, the dual-factor model of mental health (Greenspoon and Saklofske 2001) is grounded in the perspective that wellbeing and illness are separate but connected and that both should be assessed to comprehensively measure mental health. This model has been used among children and adolescents and supports the idea that the absence of illness does not indicate mental wellbeing (e.g., Antaramian et al. 2010; Suldo and Shaffer 2008). Finally, consistent with the field of positive psychology, Seligman (2011) proposed the PERMA model, which encompasses positive emotions as well as engagement, positive relationships, meaning, and accomplishment.

The commonality among most of these models is in their assessment of areas of both feeling and functioning, which is congruent with a general framework for mental wellbeing proposed by Ryan and Deci (2001). Based on this framework, mental wellbeing is a complex construct, grounded in both the hedonic and eudaimonic perspectives (Ryan and Deci 2001; Stewart-Brown 2017), which are complementary and, when combined, best reflect the construct. Both hedonic and eudaimonic perspectives are rooted in Greek philosophy. Hedonic refers to feelings, or emotional wellbeing, and is manifested in the form of positive and negative affect and life satisfaction, for example. Feelings are perceived as a state of mind that may vary according to the situation, which oftentimes may be out of the individual’s control (Stewart-Brown 2017). Eudaimonic is related to individual functioning, both on a personal and social level (e.g., psychological wellbeing, social wellbeing). This form of wellbeing is achieved through the self-development of character traits and behavior (Stewart-Brown 2017). Individuals are described as functioning well, for example, when they have a sense of purpose and direction, are self-determined, and can form positive relationships with others (Ryff 1989). Social wellbeing, another key aspect of functioning, covers wider social relationships rather than close friends and family relationships and is evaluated using more public or social criteria, such as the degree to which an individual feels accepted by their communities and can acknowledge their contribution to society (Keyes 2002). The Ryan and Deci framework will be used to guide the assessment of instruments in the current review, in terms of their preponderance of feeling and functioning. As evidenced by the literature, many terms have been used across disciplines and jurisdictions to reflect the positive side of mental health including mental wellness, mental wellbeing, and positive mental health. 
Though these terms are mainly synonymous, mental health is oftentimes viewed in the broader literature as encompassing both positive and negative aspects of mental health (Stewart-Brown 2017). Thus, for the purpose of this review, we use the term mental wellbeing.

There is a preponderance of early literature and scholarly work that focuses on mental health among adolescents from a deficit perspective. More recently, there has been increased interest within the scientific community in research on positive aspects of human nature (Rich 2003), particularly among children and adolescents (Furlong et al. 2014). This emphasis centers on the desired outcome, and as such, interventions focused on promoting better outcomes are regarded as more appealing, relevant, and beneficial to a larger proportion of the population, including those striving towards optimum levels of functioning (Rich 2003; Stewart-Brown 2017). Several fields have embraced this type of inquiry, including mental health promotion, positive psychology, and school mental health. Mental health promotion focuses on fostering an optimal state of wellbeing and enhancing strengths rather than solely on risk reduction or the prevention of mental health problems (Dwivedi and Harper 2004; Weisz et al. 2005). This approach is consonant with the science of positive psychology, which seeks to understand how people thrive outside of the context of adversity and pathology (Seligman and Csikszentmihalyi 2000). The positive psychology movement emphasizes a critical focus on fostering human capacity, through strengthening and building positive qualities that exist within each individual, in addition to the well-established, disease-based view of human functioning (Gable and Haidt 2005; Seligman and Csikszentmihalyi 2000). Concepts from positive psychology like initiative and interest have been found to have particular relevance for positive youth development (Larson 2000).

Schools are also a natural setting for the implementation of interventions to promote mental health among adolescents (Weare and Nind 2011; Weist 2005). School-based programs can be targeted at youth expressing symptoms, youth at risk, or all students in the school (e.g., universal interventions). The latter, in particular, can potentially reduce stigma and are particularly suited to focus on mental wellbeing (Wells et al. 2003). Indeed, positive evidence of effectiveness has been reported for universal programs aimed at promoting mental health as opposed to preventing mental illness (Wells et al. 2003) and interventions that focused on positive aspects of mental health rather than being problem-based (Weare and Nind 2011).

Measuring mental wellbeing can be challenging as it relies on an individual’s own perceptions of indicators such as life satisfaction and happiness. Children or adolescents may not yet be able to fully conceptualize these constructs, which would significantly influence their responses to the measure. They may also have difficulty completing evaluation instruments due to low literacy, inattention, and desire to give the “right” answer (i.e., social desirability bias; Bryant et al. 2015). Thus, instruments developed for adults may not be linguistically and conceptually appropriate for children and youth (Schmidt et al. 2001). Further, even among adolescents, there may be within-group differences in the developmental understanding and experiences of younger vs. older adolescents. For example, younger adolescents tend to be less future-oriented and desire more immediate gratification (Steinberg et al. 2009), and instruments may need to be more sensitive to these growth nuances when assessing aspects of mental wellbeing, such as life satisfaction. Additionally, differences in constructs like life satisfaction by SES (Ash and Huebner 2001) and race (Huebner et al. 2004) suggest the importance of attending to cultural and racial/ethnic variations in measurement, particularly among youth as they are still developing a sense of identity, which may include racial identity. Notwithstanding these challenges, many instruments have been used to measure mental wellbeing among adolescents. These measures are important for including the child’s perspective in decision-making and assessing the effectiveness of interventions, both at the individual and population levels (Schmidt et al. 2001).

Instruments that present a more balanced picture of youth mental wellbeing are critical to the development and evaluation of programs (e.g., school-based) designed to improve mental wellbeing among youth (Tennant et al. 2007a, b). Further, instruments that include a range of indicators are less commonly identified (Furlong et al. 2014). The current review aims to add to the knowledge about these available instruments. Accordingly, this review will identify instruments that measure mental wellbeing among adolescents and examine the instruments in terms of (1) content (positively worded items, alignment with the Ryan and Deci (2001) mental wellbeing framework); (2) conceptual relevance for youth including psychometric properties (particularly validity with youth); and (3) responsiveness to change. Within these contexts we also make recommendations for their use among adolescents.

Method

Procedure

We used systematic review methods proposed by Littell et al. (2008) and procedures for preparing a review outlined in the preferred reporting items for systematic reviews and meta-analyses (PRISMA) guidelines (Moher et al. 2009). The research team consisted of the lead author; two senior scholars with extensive experience in conducting systematic reviews, instrument development, and research on mental health; and two graduate-level research assistants. All team members were involved in the planning stages and development of the framework for the review. Librarians with expertise in conducting systematic reviews were then enlisted to assist in the development of search strategies. The lead author and one graduate student met weekly during the initial review process to establish inter-rater reliability. Throughout that phase, senior authors provided input on an as-needed basis. During the instrument selection phase and writing of the review, all research team members communicated regularly through Skype, teleconferences, and email.

Search Strategy

In consultation with librarians, the lead author developed a list of search terms and concepts to identify appropriate literature to locate measures that evaluate mental wellbeing among adolescents. This list was shared with the research team for feedback and additional suggestions. Librarians then developed specific search strategies for each database used in the review. Databases searched included SocIndex (EBSCO), PsycInfo (EBSCO), Web of Science (Thomson Reuters), PsycTests (EBSCO), Cochrane Central Register of Controlled Trials (Wiley), ERIC (EBSCO), Social Work Abstracts (EBSCO), and CINAHL (EBSCO). Terminology that captured the concept of mental wellbeing included mental wellness, wellbeing, and toughness; emotional wellness, wellbeing, health, and functioning; psychological wellness, wellbeing, health, and functioning; positive affect; and flourishing. The concept of youth was searched using terminology including adolescent, youth, teenagers, and young person. To locate tests and measures we searched concepts including but not limited to tools, instruments, batteries, and inventories. Truncation, proximity, and title/abstract searching were applied to database searches when appropriate (please see supplementary materials for our search terms by database including Boolean operators). As the review originally began in 2014, an initial date range from 1998 to 2014 was applied to all database searches. This included a 15-year search limit, plus a year, suggested by the librarians, to ensure the inclusion of potential sources published in late 1998. The search frame was set to manage the number of possible citations, and to elucidate the most recent research on mental wellbeing. It allowed the identification of instruments currently being used to measure mental wellbeing among adolescents, regardless of the date the instrument was developed. 
Due to the length of time to complete the review, the librarians completed an updated search in 2016, using the same search criteria, to ensure the most recent sources would not be excluded. All searches were completed in February 2016.

Grey literature searches were developed by librarians and were completed by March 2016. Due to search capabilities in individual grey literature resources, the scope of searches was limited to keyword searching. Librarians shared these search strategies with the lead author, which allowed for evaluation of grey literature. The World Health Organization, U.S. Department of Health and Human Services, American Psychological Association, National Center for Health Outcomes Development, The Search Institute, Child Trends, National Institutes of Health, Google Scholar, and the Substance Abuse and Mental Health Services Administration (SAMHSA) were all searched to locate grey literature.

Inclusion and Exclusion Criteria

Though the focus of our review was to elucidate instruments used to measure mental wellbeing among adolescents, inclusion and exclusion criteria were developed at the level of the empirical study record first (which included the abstract and other information about the study) and then at the level of the instrument. Evaluation of the instruments was based on meeting both sets of criteria outlined below.

Article/record citations

Screening forms with eligibility criteria were developed in Excel and completed for all studies retrieved as potentially eligible. To establish inter-rater reliability, two reviewers (lead author and one research assistant) reviewed 100 records independently and compared notes on inclusion and exclusion. This process continued until 90% reliability was established using Cohen’s kappa, after which the remaining study records were divided between the reviewers and screened independently for eligibility. All information was noted on spreadsheets and saved in Refworks folders. Any identified disagreements were resolved through discussion and input of other research team members. This first level of inclusion and exclusion was primarily aimed at ruling out empirical studies that were not consistent with our focus on mental wellbeing (i.e., focused on mental illness outcomes such as depression). Study records were retained if the study (1) was published within the time frame 1998–2016; (2) focused on or included adolescents between the ages of 12 and 18; (3) was written in English; and (4) identified a specific mental wellbeing measure. Records were excluded if the study (1) was published before 1998; (2) focused solely on children (under 12); (3) focused solely on adults (over 18); (4) was not written in English; (5) did not identify a specific mental wellbeing measure; or (6) focused solely on mental illness (e.g., depression) as an outcome.
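
Agreement between two screeners on include/exclude decisions is often summarized with a chance-corrected statistic such as Cohen's kappa. The following is a minimal sketch of that calculation, offered only as an illustration: the function name and the ten decisions are hypothetical and are not the screening data from this review.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters' categorical decisions."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of records")
    n = len(rater_a)
    # Proportion of records on which the two raters gave the same decision.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in set(freq_a) | set(freq_b)) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1.0 - expected)

# Hypothetical decisions for ten records (not data from the review).
reviewer_1 = ["include", "exclude", "exclude", "include", "exclude",
              "exclude", "include", "exclude", "exclude", "exclude"]
reviewer_2 = ["include", "exclude", "exclude", "exclude", "exclude",
              "exclude", "include", "exclude", "exclude", "exclude"]
kappa = cohens_kappa(reviewer_1, reviewer_2)
```

In this made-up example the raters agree on 9 of 10 records (90% raw agreement), yet kappa is lower because most records are excluded by both raters and some agreement is expected by chance.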

Instruments

After ruling out potential empirical study records that were not relevant to the review, instruments were then selected from the remaining records. Instrument inclusion was based on the following criteria: (1) at least 50% of items were positively worded, (2) positively worded items were scored to reflect mental wellbeing (i.e., not reverse scored), and (3) the instrument included at least one item that assessed feeling (i.e., emotional wellbeing) and one item that assessed functioning (i.e., psychological wellbeing, with or without social wellbeing). Instruments were excluded if (1) fewer than 50% of their items were positively worded, (2) positive items were reverse scored solely to indicate a negative wellbeing outcome, (3) they included areas of wellbeing that were not the focus of this review (e.g., material wellbeing), or (4) they were dimension specific (e.g., assessed self-esteem or life-satisfaction only). This last criterion is consistent with the criteria suggested by Schmidt et al. (2001) for assessing multi-dimensional instruments. Eligibility of instruments for inclusion in the review was documented using a separate Excel spreadsheet.
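
The three inclusion criteria can be read as a simple decision rule. The sketch below is one possible encoding, not the screening spreadsheet actually used; the class, its field names, and the function name are hypothetical, and the exclusion criteria concerning out-of-scope or dimension-specific content are deliberately left out because they require a judgment about item content.

```python
from dataclasses import dataclass

@dataclass
class Instrument:
    # All fields are hypothetical descriptors used only for this illustration.
    name: str
    n_items: int
    n_positive_items: int
    positive_items_reverse_scored: bool
    has_feeling_item: bool
    has_functioning_item: bool

def meets_inclusion_criteria(inst: Instrument) -> bool:
    """Apply the three stated inclusion criteria to one instrument."""
    # Criterion 1: at least 50% positively worded items.
    half_or_more_positive = inst.n_positive_items * 2 >= inst.n_items
    return (
        half_or_more_positive
        # Criterion 2: positive items are not reverse scored.
        and not inst.positive_items_reverse_scored
        # Criterion 3: at least one feeling item and one functioning item.
        and inst.has_feeling_item
        and inst.has_functioning_item
    )

# Item counts follow the instrument summaries reported in the Results.
mhc_sf = Instrument("MHC-SF", 14, 14, False, True, True)
affectometer_2 = Instrument("Affectometer 2", 40, 20, False, True, True)
```

Under this reading, an instrument with exactly half of its items positively worded, such as the Affectometer 2, still satisfies the first criterion.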

Results

Search Results

The initial search identified 5423 records including 16 records from the grey literature search. After duplicates were removed the total number of records screened was 4752. After screening and based on article inclusion criteria, 689 study records or reports were included for full review. The most common reasons for exclusion at this point of the review were a focus on a mental health problem, predictor, or outcome, or a sole focus on children under 12, young adults, or adults (e.g., college students). Additionally, articles that were not written in English or that included instruments in a language other than English were excluded. The remaining 689 studies were then fully reviewed to determine if instruments used to measure mental wellbeing met instrument inclusion criteria. Review of the description of the measures in each study revealed 12 potential instruments. Though instruments with both positively and negatively worded items are included, we applied the criteria of at least 50% positively worded items and of positive items being positively scored rather than reverse scored to ensure consonance with our construct of mental wellbeing. Further examination of the actual instruments and communication with an instrument developer limited the list to 11, as one scale was developed only for use in the context of a particular study and thus had no established psychometric properties. A PRISMA flow diagram (Fig. 1; Moher et al. 2009) was completed to document final numbers for the review, instrument inclusion, and the overall process.

Fig. 1 Data extraction of the studies. Based on PRISMA flow diagram, Moher et al. 2009

General Overview of Instruments

Five scales were developed solely in the United States (US), with one scale (EPOCH Measure of Adolescent Wellbeing) implementing scale development studies in both the US and Australia. The other five scales were developed in Denmark (WHO-5), United Kingdom and Scotland (Warwick-Edinburgh Mental Well-Being Scale Original and Short Versions), New Zealand (Affectometer 2), and India (PGI Well-being Scale). Within these countries, instrument development took place among individuals in various settings including schools, communities (e.g., participants’ homes, organizations), medical settings, and on the phone. Instruments ranged from five items (WHO-5) to 100 items (Child and Adolescent Wellness Scale (CAWS)) and took as little as 3 min and as long as 25–30 min to complete, depending on instrument length. Table 1 provides a snapshot of the instruments including the relevant citation. Four of the scales were developed for adolescents. Though the scales varied in their preponderance of feeling and functioning items, all scales included at least one indicator of each. Five of the 11 scales were 100% positively worded.

Table 1 Snapshot of mental wellbeing instruments

A brief summary of each instrument is provided below:

Affectometer 2

The Affectometer 2 is designed to assess a range of aspects related to positive mental health, including subjective wellbeing, psychological functioning, and relationships. The 40 items are based on the balance of positive and negative feelings and functioning in a recent experience and cover ten categories: confluence, optimism, self-esteem, self-efficacy, social support, social interest, freedom, energy, cheerfulness, and thought clarity. Half of the items are positively worded and half negatively worded, and responses are on a 5-point scale ranging from “not at all” to “all of the time.” It requires 5 min to complete. The total score is calculated by subtracting the sum of negative items from the sum of positive items and may range from −80 to +80.
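
The stated range follows directly from the scoring rule: with 20 positive and 20 negative items on a 5-point scale, the difference of the two sums spans 160 points. A minimal sketch is given below; the function name is hypothetical, and coding the responses as 1 to 5 is an assumption (coding them 0 to 4 yields the same range).

```python
def affectometer2_total(positive_ratings, negative_ratings):
    """Sum of the 20 positive-item ratings minus the sum of the 20 negative-item ratings."""
    positive_ratings = list(positive_ratings)
    negative_ratings = list(negative_ratings)
    if len(positive_ratings) != 20 or len(negative_ratings) != 20:
        raise ValueError("expected 20 positive and 20 negative item ratings")
    # Assumption: each response is coded as an integer from 1 to 5.
    if any(r not in range(1, 6) for r in positive_ratings + negative_ratings):
        raise ValueError("ratings are assumed to be coded 1-5")
    return sum(positive_ratings) - sum(negative_ratings)

highest = affectometer2_total([5] * 20, [1] * 20)  # 100 - 20
lowest = affectometer2_total([1] * 20, [5] * 20)   # 20 - 100
```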

Child and adolescent wellness scale (CAWS)

The CAWS assesses positive psychological health among school-aged children (grades 6–12) within ten domains: empathy, connectedness, self-efficacy, adaptability, initiative, conscientiousness, social competence, optimism, emotional self-regulation, and mindfulness. Items representing both functioning (e.g., I don’t give up easily, I am determined) and feeling (e.g., I often feel hopeless, I am cheerful) are included. It is a 100-item self-report questionnaire, including 11 reverse-scored negative items, and requires 25–30 min to complete. An earlier version of the scale contained 150 items. Responses are on a 4-point scale with options ranging from “not at all like me/strongly disagree” to “very much like me/strongly agree.” The average is then taken of the scores for each dimension.

EPOCH measure of adolescent well-being (EPOCH)

The EPOCH is a self-report questionnaire that assesses adolescent positive psychological functioning and feeling. It was developed for adolescents from the PERMA Profiler, which measures flourishing among adults. The measure includes five positive psychological traits, including engagement, perseverance, optimism, connectedness, and happiness. Respondents rate 20 items on a 5-point scale, ranging from “almost never” to “almost always.” Scores are calculated for each of the five domains. Time for completion was not available.

Friedman well-being scale

The Friedman well-being scale is a self-report questionnaire which measures overall psychological well-being. It has five sub-scales: emotional stability, self-esteem/self-confidence, sociability, joviality, and happiness. It includes 20 bi-polar adjectives with “very,” “moderately,” and “neither” self-report options. It requires 2–3 min to complete. Respondents rate their typical feeling with respect to each adjective pair, such as angry/calm, on a scale of 0–10, where 0 indicates “very” for the negative adjective and 10 indicates “very” for the positive adjective. Scores may be calculated for each of the five subscales as well as for an overall measure of wellbeing called the Friedman well-being composite.

Mental-health continuum-short form (MHC-SF)

The MHC-SF is a self-report questionnaire adapted from the Mental-Health Continuum-Long Form. It includes three items representing emotional wellbeing, six items representing psychological wellbeing, and five items representing social wellbeing. All 14 items are positively worded and are rated on a 6-point scale ranging from “never” to “every day.” The sum of the ratings is taken to calculate the score, which may range from 0 to 70. Time for completion was unavailable.

PGI well-being scale (PGI)

The PGI is a short self-report questionnaire that measures subjective wellbeing or positive mental health. It consists of 20 positively worded items within four domains: physical, anxiety, mood, and self/others. Feeling (e.g., feeling happy in life) and functioning (e.g., interested in life a good bit of the time) items are included. Questions ask participants to tick items applicable to them in the last month. Items are rated on a 4-point scale ranging from “not at all” to “all of the time.” Higher domain and total scores indicate higher wellbeing. It requires 5–8 min to complete.

Ryff scales of psychological well-being (RYFF)

The RYFF measures multiple facets of psychological wellbeing including self-acceptance, positive relations with others, autonomy, environmental mastery, purpose in life, and personal growth. It also includes items such as “In general I feel confident and positive about myself” and “When I look at the story of my life, I am pleased with how things have turned out,” which reflect the feelings component of mental wellbeing. It is a self-report or interviewer-administered questionnaire consisting of 84 items (original scale), 54 items, or 18 items. Respondents rate statements on a 6-point scale ranging from “strong disagreement” to “strong agreement.” Scores are calculated by reversing ratings on negatively worded items, then summing the degrees of agreement on the items corresponding to each of the six dimensions. Time for completion was unavailable.
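
Reverse scoring on a 6-point scale maps a rating r to 7 − r, so that higher values always indicate greater wellbeing before summing. The sketch below illustrates the rule for a single dimension; the function name, the item identifiers, and the 1 to 6 response coding are assumptions made for illustration, not the published scoring key.

```python
def ryff_dimension_score(ratings, negatively_worded, scale_max=6):
    """Sum ratings for one dimension after reversing negatively worded items.

    ratings: dict mapping item id -> rating coded 1..scale_max (coding assumed).
    negatively_worded: set of item ids to reverse (hypothetical ids here).
    """
    total = 0
    for item, rating in ratings.items():
        if not 1 <= rating <= scale_max:
            raise ValueError(f"rating for {item!r} outside 1..{scale_max}")
        # Reverse negatively worded items: 1 <-> 6, 2 <-> 5, 3 <-> 4.
        total += (scale_max + 1 - rating) if item in negatively_worded else rating
    return total

# Hypothetical three-item dimension in which item "q2" is negatively worded.
score = ryff_dimension_score({"q1": 5, "q2": 2, "q3": 6}, {"q2"})
```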

Social and emotional health survey (SEHS)

The SEHS is a self-report questionnaire designed to measure core psychological components of covitality, representing a general index of youth positive mental health. It includes four subscales: belief in self, belief in others, emotional competence, and engaged living. There are a total of 36 positively worded questions assessing feeling (e.g., since yesterday how much have you felt grateful) and functioning (e.g., I accept responsibility for my actions). For 30 of the questions, respondents rate statements on a 4-point scale ranging from “not at all true of me” to “very much true of me.” The remaining six items are rated on a 5-point scale ranging from “not at all” to “extremely.” Scores are calculated for each of the four subscales by totaling the ratings on corresponding items. The overall covitality score ranges from 36 to 150, with 85 or under considered low. Time for completion was unavailable.
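
The stated covitality range can be checked from the item counts: 30 items on a 4-point scale and 6 items on a 5-point scale give a minimum of 30 + 6 = 36 and a maximum of 120 + 30 = 150. The sketch below reproduces that arithmetic; the function and constant names are hypothetical, and the 1-based response coding is an assumption chosen because it reproduces the reported range.

```python
FOUR_POINT_ITEMS = 30  # rated "not at all true of me" to "very much true of me"
FIVE_POINT_ITEMS = 6   # rated "not at all" to "extremely"
LOW_CUTOFF = 85        # totals of 85 or under are described as low

def covitality_total(four_point_ratings, five_point_ratings):
    """Sum all 36 ratings, assuming 1-based coding (1-4 and 1-5)."""
    if len(four_point_ratings) != FOUR_POINT_ITEMS:
        raise ValueError("expected 30 four-point ratings")
    if len(five_point_ratings) != FIVE_POINT_ITEMS:
        raise ValueError("expected 6 five-point ratings")
    if any(not 1 <= r <= 4 for r in four_point_ratings):
        raise ValueError("four-point ratings are assumed to be coded 1-4")
    if any(not 1 <= r <= 5 for r in five_point_ratings):
        raise ValueError("five-point ratings are assumed to be coded 1-5")
    return sum(four_point_ratings) + sum(five_point_ratings)

lowest = covitality_total([1] * 30, [1] * 6)   # 30 + 6
highest = covitality_total([4] * 30, [5] * 6)  # 120 + 30
is_low = lowest <= LOW_CUTOFF
```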

Warwick-Edinburgh mental well-being scale (WEMWBS)

The WEMWBS is a 14-item self-report questionnaire that measures mental wellbeing, including emotional wellbeing and psychological functioning. All items are worded positively and address aspects of positive mental health. Together they cover most, but not all, attributes of mental wellbeing including both hedonic and eudaimonic perspectives. Areas not covered include spirituality or purpose in life. Respondents rate each item on a 5-point scale, ranging from “none of the time” to “all of the time.” Time for completion was unavailable.

Warwick-Edinburgh mental well-being scale—short version (SWEMWBS)

The SWEMWBS is a shortened version of the WEMWBS. It is a 7-item self-report questionnaire that measures mental wellbeing, covering emotional wellbeing and psychological functioning. All items are worded positively and address aspects of positive mental health. Respondents rate each item on a 5-point scale, ranging from “none of the time” to “all of the time.” Time for completion was unavailable.

WHO-5 well-being index (WHO-5)

The WHO-5 consists of five items that measure current wellbeing, including functioning (e.g., my daily life has been filled with things that interest me) and feeling (e.g., I have felt cheerful and in good spirits) items. Responses are on a 6-point scale ranging from “all of the time” to “at no time.” Ratings are summed to calculate the score; higher scores indicate better wellbeing, and a score below 13 indicates the need for testing for depression. Time for completion was unavailable.
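
The scoring rule and screening threshold described above can be sketched as follows. The function names are hypothetical, and the response coding (0 for “at no time” through 5 for “all of the time,” giving a raw total of 0 to 25) is an assumption, since the coding is not stated in this summary.

```python
def who5_raw_score(ratings):
    """Sum the five item ratings, assumed to be coded 0 (at no time) to 5 (all of the time)."""
    if len(ratings) != 5:
        raise ValueError("expected ratings for exactly five items")
    if any(r not in range(6) for r in ratings):
        raise ValueError("ratings are assumed to be coded 0-5")
    return sum(ratings)

def below_screening_threshold(raw_score):
    """True when the raw score is below 13, the threshold described in the text."""
    return raw_score < 13

# Hypothetical respondent.
example_score = who5_raw_score([3, 2, 2, 3, 2])
flagged = below_screening_threshold(example_score)
```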

Psychometric Validation and Sensitivity to Change

All scales had acceptable reliability and validity, based on psychometric properties established in original scale development, and good reliability in studies assessing mental wellbeing among youth. Table 2 shows the psychometric properties of instruments that were validated for use with youth, based on empirical studies identified through this review. Validation is important to establish as it directly relates to the scales’ contextual relevance for youth. Our findings revealed that the Affectometer 2, Friedman Wellbeing Scale, and SWEMWBS have not specifically been validated among youth. It is important to note that though there have not been separate validation studies of the SWEMWBS, the items on the scale have been cognitively tested with youth through validation studies of the parent scale (WEMWBS; see below). The four scales that were developed for youth (CAWS, EPOCH, PGI, and SEHS) reported criterion and/or construct validity and used theoretical or empirical literature as the basis for scale development. The validity reported for the CAWS was based on the 150-item version of the scale (Copeland et al. 2010). Evidence of validity among youth was identified for the MHC-SF (Keyes 2006), WEMWBS (Clarke et al. 2011), and WHO-5 (Allgaier et al. 2012). Importantly, factorial validity for the RYFF scales was identified using shorter (i.e., 18-item) or adapted (e.g., 30-item; Fernandes et al. 2010) versions of the scale. Cognitive testing was reported for only one scale; WEMWBS researchers (Clarke et al. 2011) used focus groups to assess acceptability and comprehensibility among adolescents in their validation study.

Table 2 Validity of scales among youth

Another key aspect of a measure is its responsiveness or sensitivity to change. Three scales were used in intervention studies with youth: RYFF (e.g., Eniola and Ajobiewe 2013), WEMWBS (Huppert and Johnson 2010), and SWEMWBS (Manicavasagar et al. 2014); all detected change from pre-test to post-test. Additionally, the original version of the CAWS (150-item) was used in a dissertation study (Molina 2008) and found to detect change from pre-test to post-test.

Discussion

The increased focus on mental wellbeing among adolescents necessitates the use of validated scales, inclusive of positive feeling and functioning indicators, to assess improvement in wellbeing without ‘ceiling effects’ (Wells et al. 2003). As the field of mental health has moved to more fully accommodate mental wellbeing, practitioners and researchers in this field (e.g., mental health promotion, school-based mental health) can use these scales to effectively evaluate the impact of interventions and programs designed to promote mental wellbeing (Bryant et al. 2015). Through the review, we identified 11 instruments used with youth, in the United States and internationally, that encompass indicators of both feeling and functioning. We discuss the scales further below, focusing on reflection of mental wellbeing content (including alignment with the Ryan and Deci (2001) framework), conceptual relevance among youth, and sensitivity to change. Recommendations are also made for use of the instruments to measure mental wellbeing among adolescents.

Reflection of Mental Wellbeing Content

Mental wellbeing, as a construct, is still not consistently defined, even though progress has been made, as is evident in the varying definitions and models presented earlier in this review. Specifically, there remains some debate about which domains are relevant to measuring mental wellbeing, especially with regard to ‘feeling’ and ‘functioning.’ Feelings are perceived as a state of mind, subject to change dependent on the circumstance (Stewart-Brown 2017), and commonly reflected in the form of affect or life satisfaction. Functioning, either on a personal or social level, can be achieved through the development of individual character traits and behavior (Stewart-Brown 2017) and is manifested in concepts like sense of purpose or worth and integration into social environments. The distinction between feeling and functioning may be challenging in practice because some items incorporate elements of both. For example, confidence is a feeling, but confidence can also be developed by acquiring skill in recognizing and challenging negative beliefs about self and thus can also be considered functioning. Consequently, items such as “In general I feel confident and positive about myself” (RYFF scale) could potentially be viewed as either feeling or functioning, illustrating the inter-related nature of the two dimensions. Optimism can similarly be held to be both a feeling and a learned behavior or aspect of functioning. Gratitude and appreciation are both feelings and, at the same time, positive skills that can be cultivated (i.e., functioning). Feeling active and energetic, though feelings, notably sit on the cusp of physical and mental wellbeing. For the purpose of this review we classified these items (e.g., confidence) as ‘feeling’, and thus our list included a scale known for measurement of psychological functioning (RYFF) and another that, at first examination, appears to reflect more social and psychological functioning (SEHS). Hence, decisions about whether an item reflects feeling or functioning are based on subjective interpretation, and we fully acknowledge that other researchers may utilize different benchmarks for their assessment of the scales included in our review.

Accordingly, all 11 instruments included at least one item relating to feeling and one item relating to functioning. Five instruments were predominantly (80–99%; CAWS, RYFF, SEHS, SWEMWBS) or mainly (60–79%; MHC-SF) functioning. Three scales were predominantly (80–99%; WHO-5) or mainly (60–79%; Affectometer 2, Friedman Well-being Scale) feeling. The final three scales (EPOCH, PGI, WEMWBS) were somewhat balanced in their inclusion of feeling and functioning items. Within the functioning items, scales varied in their social content from belongingness (CAWS) to societal contributions (MHC-SF). Some of the included scales (e.g., WHO-5) did not cover the construct of social wellbeing. Results suggest the importance of choosing a scale that best reflects the aspects or domains of wellbeing being assessed or that aligns with a particular theoretical and intervention model. It is also significant to note that a brief description of a scale, or the instrument’s name, may not fully reflect the actual items being measured. For example, the Affectometer 2, though labeled as an instrument to measure affect, also assesses aspects of functioning. Thus, examining the actual items of the instrument is important to ensure that the scale is consistent with the goal of the research. Across the scales, there was some consistency regarding aspects or domains of mental wellbeing measured. The most common items measured (found in five or more scales) included: feelings about self (e.g., esteem, acceptance), self-efficacy, freedom/autonomy, cheerfulness/joviality, social ties and relationships, emotional stability, engaged living, happiness, and purpose in life. Other common items included optimism, social competence, energy, life satisfaction, and relaxation. In summary, the scales measure psychological, emotional, and social aspects of mental wellbeing. A few scales, such as MHC-SF and CAWS, incorporated all three constructs.
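The percentage bands used above to describe each scale’s balance of feeling and functioning items can be illustrated with a short sketch. This is a minimal illustration under stated assumptions, not the procedure used in the review: the function name `classify_balance` is hypothetical, the bands are treated as cut-offs at 80% and 60%, any share below 60% is assumed to count as ‘somewhat balanced’ (that band is not defined numerically in the text), and the item counts in the usage lines are invented rather than taken from any of the reviewed scales.

```python
def classify_balance(n_feeling: int, n_functioning: int) -> str:
    """Label a scale by the share of its items in the larger category.

    Assumed cut-offs (hypothetical reading of the bands in the text):
    >= 80% -> 'predominantly', >= 60% -> 'mainly', otherwise 'somewhat balanced'.
    """
    # Every included instrument had at least one item of each type.
    if n_feeling < 1 or n_functioning < 1:
        raise ValueError("a scale needs at least one feeling and one functioning item")

    total = n_feeling + n_functioning
    share = 100 * max(n_feeling, n_functioning) / total
    dominant = "feeling" if n_feeling > n_functioning else "functioning"

    if share >= 80:
        return f"predominantly {dominant}"
    if share >= 60:
        return f"mainly {dominant}"
    return "somewhat balanced"


# Invented item counts, for illustration only.
print(classify_balance(4, 1))   # 80% feeling items
print(classify_balance(3, 11))  # about 79% functioning items
print(classify_balance(7, 7))   # even split
```

Because item classification is itself a subjective judgment, as acknowledged above, the resulting labels are only as reliable as the item counts supplied to such a calculation.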

It is critical to note that during the course of the review, we observed that many scales designed to assess negative feelings (e.g., depression) were utilized to represent mental wellbeing by their absence. These measures on the whole did not assess functioning aspects of the construct. Further, instruments used to measure some but not all of the relevant and integral feeling aspects of mental wellbeing, such as life satisfaction (e.g., SLSS) and affect (e.g., PANAS), appeared often during initial stages of the review but were excluded because they did not include functioning items. Typically, when these measures were used to measure mental wellbeing, additional instruments (e.g., self-esteem) were used alongside them to capture other aspects of mental wellbeing (e.g., functioning). Similarly, quality of life instruments (e.g., Lancashire Quality of Life Profile) were also identified in the review, but assessment showed that these scales included other wellbeing constructs not related to mental wellbeing, such as material wellbeing. Accordingly, these types of instruments were not included in the list of instruments, given the current review’s goal of identifying instruments that covered solely the feeling and functioning aspects of mental wellbeing.

Instruments in our review included at least 50% positively worded items, scored as such to be consistent with our conceptualization of the construct and with more recent definitions of the term. This criterion led to the exclusion of instruments initially considered for inclusion in the review, such as the Mental Health Inventory (Veit and Ware 1983) and the Psychological Well-being Measure (van Bergen et al. 2008). Though we acknowledge that mental wellbeing can be assessed with both positively and negatively worded items, particularly to avoid response bias, we also recognized a preponderance of research that describes or contextualizes mental wellbeing based on the reciprocal scoring of negatively worded items. This, in turn, may lead to the identification of very different instruments as well as those subject to ceiling effects. Further, as we have previously established that absence of disorder does not equate to mental wellbeing, positively worded items may be distinctly different in their generation of responses compared to negatively worded items (Kern et al. 2015). For example, research has shown that interpretation can be qualitatively different for positively vs. negatively worded items, varying for one set of items vs. the other (Borgers et al. 2004). There is also some indication, at least in adults, that study participants prefer positive measures (Crawford et al. 2011), and there are theoretical reasons to think that positively focused instruments better support positively focused interventions (Stewart-Brown 2017).

Conceptual Relevance Among Youth

Research contends that instruments developed and normed for adults may not be reliable and valid among youth, due to their unique developmental stage and consequent conceptual understanding of identified constructs. Of the 11 instruments, only four were originally developed for use among adolescents (CAWS, EPOCH, PGI, and SEHS). These scales had strong theoretical and/or empirical foundations, and initial psychometric properties (i.e., reliability and validity) reflected relevance for use with youth populations. The other seven instruments were developed for use among broader populations, mainly adults, with psychometric studies conducted in some cases to assess suitability for use among adolescents. The results of the review showed good reliability or stability for all seven of these scales among adolescents. For the RYFF, research on the factor structure of scales among adolescents suggested consistency with domains previously identified in the original development of the scale, though the confirmation was for shorter or adapted versions of the scale. Psychometric studies for MHC-SF, WEMWBS, and WHO-5 confirmed the scales’ validity (e.g., construct, criterion) among youth. For the three remaining scales (i.e., Affectometer 2, Friedman Well-being Scale, SWEMWBS), no psychometric properties other than reliability were available among youth.

Validity of scales among youth is key to ensuring that the instrument items are conceptually relevant to that population. Thus, use of scales not validated among youth may require additional psychometric testing to determine suitability for use among that population. Importantly, only one of the eleven scales (WEMWBS) included cognitive testing to confirm acceptability and comprehensibility among youth (Clarke et al. 2011). This is particularly relevant for scales developed for adult populations as they may require semantic adaptation through qualitative methods to strengthen conceptual relevance among youth (Fernandes et al. 2010). Thus, an important recommendation for scales being used with adolescents, particularly for those developed for adults, is to include cognitive testing as part of the scale validation process to ensure items are reflective of youth understanding of measured constructs. When using a scale, it is also critical to ensure conceptual relevance and understanding of scales across diverse groups of adolescents (e.g., age group). Developmentally, younger adolescents may have a different conceptual understanding of certain constructs and this may be reflected in their choice of response. Further, as mental wellbeing can vary by culture, scales developed in one country may need to be assessed for cultural relevance among diverse youth within the country as well as outside of the country (Tennant et al. 2007a, b). This is critical as group differences in responses, for example, may be reflective of inconsistencies in the psychometric properties of instruments or may not be evident due to invalid measurement (Chen 2008). Thus, when using a mental wellbeing instrument, it is important to ensure that it is validated across cultural groups, age groups, and gender groups (e.g., Fernandes et al. 2010). Though this work is evident in some of the scales (e.g., SEHS validation across cultural groups, gender, and age groups; Furlong et al. 2014; You et al. 2015), it is important to ensure this type of validation precedes use or is concurrent with use of any of the scales among diverse subgroups of youth.

Sensitivity to Change

An important consideration when measuring mental wellbeing in intervention studies is an instrument’s ability to detect change across measurement time points (Guyatt et al. 1987). As more interventions are assessing mental wellbeing among adolescents, it is important to delineate a scale’s responsiveness to change. Our review did not include many intervention studies. However, the CAWS, WEMWBS, and SWEMWBS were used in studies that examined change over time using at least a one-group design. Results of these studies indicate that the scales were able to detect positive change in mental wellbeing from preintervention to postintervention. Importantly, we identified only one scale (WEMWBS) for which formal sensitivity to change analyses have been conducted, with change detected at the group or individual level (Maheswaran et al. 2012).

General Recommendations for Use of Current Scales

Many of the scales, according to our findings, had been used in only a limited number of studies with adolescents. This may be due to more recent development (e.g., EPOCH), lack of validation of scales among youth, or the availability of the measure in the public domain. To this last point, most of the scales were free to use and in the public domain or needed developer permission/acknowledgement. For one scale, availability was unclear (PGI) and for another, there was a cost to use (Friedman Well-being Scale). Depending on the goal of the research and funding constraints, it may be important to identify scales that are free to use or free with developer permission. Similarly, our review found studies that utilized different versions of the same scale, which varied in length. Decision-making about length should consider burden to the youth participants, time constraints, consistency with the goal of the research, fidelity to the factor structure of the scale, use of a psychometrically tested version, and cultural relevance to the population being studied. A final recommendation is based on the setting. Earlier we stated that the purpose of the research is important to choosing a scale that reflects the aspect of mental wellbeing being assessed. It is also important to choose a scale based on how it will be used (e.g., clinical, intervention, surveillance). For example, the Friedman Well-being Scale was designed particularly for use in clinical assessment of individual change (P. Friedman, personal communication, March 30, 2016). The SEHS is highlighted as a school-based surveillance survey, assessing positive indicators of mental health functioning among youth (You et al. 2015). Thus, the setting has implications for the type of scale that should be used to measure mental wellbeing.

Limitations and Future Research

Though our review identified 11 instruments that measured mental wellbeing, most of the scales were used in few articles among adolescents, making comprehensive assessment of the appropriateness of each scale, in general and for sub-groups of adolescents, challenging. Additionally, other well-known instruments that measure multiple domains or aspects of mental wellbeing relevant for adolescents but not yet used or validated with this group were excluded from this review. For example, the Comprehensive Instrument of Thriving (CIT; Su et al. 2014) was identified through the grey literature search, but subsequent examination of the peer-reviewed and grey literature, within our search timeframe, did not identify studies indicating use among adolescent populations. Further, some studies included in the review were not fully consistent with the 12–18 age range we identified. That is, a few studies included adolescents along with young adults, such as those that utilized a different age range for their definition of adolescents (e.g., 15–24). Thus, our findings are not limited solely to the 12–18 age range. Also, the literature search may not have identified all the scales used in non-published studies, such as program evaluations, so a comprehensive assessment of sensitivity to change of the instruments was not possible. Importantly, the Ryan and Deci framework is one model of instrument evaluation and may not be applicable to all research on mental wellbeing. Thus, some studies we identified used well-known and validated instruments to operationalize mental wellbeing that did not meet our inclusion criteria but might have met those of other researchers. Notwithstanding these limitations, to our knowledge this review is the first to systematically identify and examine characteristics of instruments measuring both feeling and functioning domains of mental wellbeing in a positive way among youth, making an important contribution to the adolescent mental health literature.
Priorities for future research include further validation of identified mental wellbeing scales among adolescents, particularly with regard to cognitive testing and determining cultural and conceptual relevance for subgroups of adolescents. Additionally, validation of other comprehensive mental wellbeing scales (e.g., CIT) not included in the review should be implemented among adolescent populations.

Globally, promoting mental wellbeing among adolescents is of great public health and social significance. However, as promoting mental wellbeing becomes critical to the field of practice, practitioners need access to relevant and acceptable measures that comprehensively evaluate improvement in wellbeing without ‘ceiling effects’, in contrast to traditional scales that assess mental illness or poor wellbeing. A variety of scales have been validated and used with adolescents, offering a choice in terms of overall length, generation of subscales, balance of items relating to feeling and to functioning, and the inclusion or not of relational, social, or emotional dimensions. Implications include the importance of considering the setting for use of the scale, validating adult-developed instruments for youth, and ensuring the instrument’s cultural and conceptual relevance within groups of adolescents.