1 Introduction

Although research into children’s and adolescents’ self-reported well-being is underdeveloped in comparison with the same kind of research among adults, a variety of instruments are already employed in scientific research into the younger population (Huebner 2004).

However, very few empirical tests have been published with the results obtained from different instruments in different countries or cultural contexts – with a few exceptions, most of the research results available are from the Anglo-Saxon world.

In order to compare children’s well-being across countries there is an increasingly apparent need for indicator systems to include relevant subjective indicators that provide information regarding children’s self-assessment of their lives. Initial testing, like that promoted by UNICEF with Report Card 7 (Adamson 2007), has stimulated significant international debate, which reflects the potential interest in this field. That said, one of the key points in the debate is that the use of a one-item scale as the only indicator of children’s subjective well-being (e.g. Cantril’s ladder) is too weak a solution to the challenge.

Although a number of research publications use multiple-item scales, there remains a need for more studies to analyze the reliability and relevance of the various instruments available for collecting children’s self-assessments across different cultures, languages and countries in order to use the results as subjective indicators in the international arena.

This paper presents data taken from different instruments for measuring adolescents’ self-reported well-being. At least four of the instruments are widely-known and have often been used on adolescent populations in different countries. One of the instruments is specifically designed for children and adolescents aged 8 to 18. Two of the instruments were initially designed for adults, but have been used with adolescents in different countries, and previous results suggest they are well understood and perform well with populations aged 12–18. The other well-known instrument we are using is a single-item scale on overall life satisfaction, which following Campbell, Converse and Rodgers’ (1976) proposal has become commonly used in all kinds of research in this field, although with different wordings.

Besides the well-established cultural response bias to self-reported well-being scales (Diener and Suh 2000; Diener and Tov 2007), studies carried out on adult populations suggest that there are cultural differences in the perception and understanding of the concepts of “satisfaction” and “happiness” (Lau et al. 2005). These cultural differences have not been explored in any depth among children and adolescents, but preliminary research, both quantitative and qualitative, suggests that children from different countries and different ages may not understand the meaning of overall life “satisfaction”, “happiness” or “contentment” in the same way (Cummins and Lau 2005; Camfield 2006).

The data presented here were collected in four Latin-language-speaking countries: Argentina, Brazil, Chile, and Spain. Results which compare two Latin-language-speaking countries – Spain and Romania – using the PWI and OLS have already been presented at an IQOLS conference (Casas et al. 2009).

Although Spanish is an official language in Spain, Argentina and Chile, understood by the majority of the population, all scales had to be adapted to each country’s specific use of the language. In some cases, additional changes have also been required due to the socio-cultural understanding of different situations in each country from a child’s point of view. The questionnaires were administered in the Catalan language in the autonomous community of Catalonia, Spain. In Rio Grande do Sul, Brazil, the questionnaire was administered in the variation of Portuguese spoken in that region. The specific wording of each instrument in each language and country and their back translation into English can be found at http://www.udg.edu/eridiqv (click “instruments”).

The aim of this paper is to explore the subjective well-being of 12 to 16-year-old adolescents in 4 countries, testing different measures of well-being and analyzing the different results obtained with each measure in each different socio-cultural context. Comparability of measures in different countries will be tested using multi-group Structural Equation Modeling. Additionally, the comparative performance of each scale will be analyzed in order to ascertain whether some of the scales are more appropriate than others for use as social indicators of adolescents’ subjective well-being in the international arena.

2 Methodology

2.1 Instruments

Three different multiple-item scales and seven single-item scales were tested on adolescents in the 4 countries. The multiple-item scales are the PWI (Cummins et al. 2003), the SWLS (Diener et al. 1985) and the BMSLSS (Seligson et al. 2003). Three well-known single-item scales were used: one on overall life satisfaction (OLS), another on overall happiness (HOL), and Fordyce’s single-item scale. Two items from Russell’s scale (2003) on core affects (CAS) were also used, one on happiness and the other on satisfaction. Finally, item 6 of the BMSLSS and item 3 of the SWLS were tested separately as single items.

Scores for all items were collected using a 0–10 scale following Cummins & Gullone’s (2000) recommendations and scores for all multi-item scales were transformed into a 0–100 scale to facilitate comparison. Table 1 provides a summary of the labels used for each scale.
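The 0–100 transformation can be illustrated with a minimal sketch. It assumes that a multi-item index is the unweighted mean of its 0–10 item scores, linearly rescaled; the text does not state the exact formula, and the function name `to_0_100` and the example scores are hypothetical.

```python
from statistics import mean


def to_0_100(item_scores, item_min=0, item_max=10):
    """Rescale the mean of one respondent's item scores to a 0-100 range.

    Assumption: the index is the unweighted mean of the items, linearly
    rescaled. The paper does not spell out the exact formula.
    """
    if not item_scores:
        raise ValueError("at least one item score is required")
    avg = mean(item_scores)
    return 100 * (avg - item_min) / (item_max - item_min)


# Hypothetical respondent answering seven items on a 0-10 scale.
pwi_items = [8, 7, 9, 10, 6, 8, 8]
print(to_0_100(pwi_items))  # prints 80.0
```

Under this assumption a respondent who answers 10 to every item scores 100 and one who answers 0 to every item scores 0, which keeps multi-item indices directly comparable with one another.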

Table 1 Characteristics of scales included in this research

2.1.1 Personal Well-Being Index (PWI)

This scale was designed by Cummins, Eckersley, Pallant, et al. (2003) as part of the Australian Unity Wellbeing Index. Originally, it included 7 items on satisfaction with different life domains. In recent years an additional item has been proposed: satisfaction with spirituality or religious beliefs (Wills 2009). This additional item has proved problematic when comparing different socio-cultural contexts, at least among adolescents, as pointed out by Casas et al. (2009). In this study it was therefore split into two separate items, one on spirituality and one on religion, rather than included as a single item.

The scale used has an end-labelled format, from completely dissatisfied (0) to completely satisfied (10).

In order to analyze results across countries, we present 3 calculations:

  • The PWI index with the original 7 items (PWI-7).

  • The PWI-7 plus the scores for an item on Satisfaction with own spirituality.

  • The PWI-7 plus the scores for an item on Satisfaction with own religion.

This scale was originally designed for use with adult populations. However, it has been tested on 12-year-olds and older adolescents in some countries (Romania, Spain) and shown good psychometric properties. A school-children’s version (PWI-SC) has been developed by Cummins and Lau (2005) and tested on Australian and Chinese populations. This version exchanges “satisfaction” with life domains for “happiness” with each life domain. Testing on the Catalan population has raised doubts regarding whether this is an appropriate change for speakers of other languages, as meanings differ. We therefore decided to retain the phrasing of the adult version for our adolescent samples. In fact, Tomyn and Cummins (2010) also kept “satisfaction” when collecting data from a sample of Australian adolescents using the PWI-SC.

The psychometric properties of the PWI have been published in several articles (see, for example, Lau et al. 2005; International Wellbeing Group 2006). Cronbach’s α was originally reported to lie between 0.7 and 0.8. The seven original domains form a single component and predict over 50% of the variance for “satisfaction with life as a whole” (Cummins et al. 2003) with adult samples.

One of the items on this scale refers to “satisfaction with feeling part of your community”. This item could not be used in Spain as the concept of “community” has completely changed in modern Spanish urban life to mean something very different to that intended in the original English version. In Spanish urban life “community” denotes the “cluster of owners of apartments in the same building”. The word does not have the friendly connotations still awarded it in Argentina, for example – in Spain communities of owners are actually thought of as being unfriendly. No other word has replaced the original friendly meaning of “community” in everyday language, and many children and adolescents in the original testing reported not understanding the question. For this reason, prior research conducted by the Spanish team with this scale has used an alternative item regarding “satisfaction with the groups of people you belong to”. With this item the scale performs well and the seven items also load on a single component explaining approximately 47% of the variance (Casas et al. 2009).

In Chile, the research team decided to use the same adapted item on satisfaction with groups as in Spain, while in Argentina they decided to keep the original item on satisfaction with the community. In Brazil they decided to keep the translation of the original item in English as well as the new item regarding “satisfaction with the groups of people you belong to”.

2.1.2 Satisfaction with Life Scale (SWLS)

This scale was designed by Diener et al. (1985). It includes 5 items and responses were originally coded on a scale of 1–7 according to level of agreement. Several Spanish adaptations exist. Most Spanish adaptations have changed item 5 due to its negative wording, so scores for this item must be reversed before adding together the 5 items to calculate the index for the overall scale.

For this research, we have re-translated the original English version to adapt it to the different local variations of the Spanish language in each country involved in this data collection. However, only the Argentinian team used a double-negation wording for item 5 that could be understood by adolescents and did not need to be reversed. This item will be specifically analyzed because, according to our previous experience with Spanish samples, its performance among adolescents is suspected to differ from that among adult populations.

Some authors have deemed this item problematic, even suggesting that it might reflect other areas more than well-being (Veenhoven 1994, 2009). High scores in response to the phrase “If I could live my life over, I would change almost nothing” may reflect more a conformist personality than personal well-being. People evaluating new experiences may be happy with their lives but not want to repeat the same experiences – considering new experiences to be more desirable. This may be the case for many young people during adolescence.

In all countries, the 1–7 scale was changed to a 0–10 scale in order to make it more sensitive, and also for easier comparison of results with other instruments using a 0–10 scale. Labels have been given to each value from strongly agree to strongly disagree.

Item 3 of this scale refers to agreement with the phrase “I am satisfied with my life”. Although this phrasing does not include a reference to “overall life” or “life as a whole”, it is probably understood in this way by most adolescents in the four countries under study. Its meaning therefore risks being redundant with the single item on overall life satisfaction.

In order to analyze results across countries we present 4 calculations:

  • The complete SWLS index, transformed into a 0–100 scale.

  • The SWLS index without item 3, transformed into a 0–100 scale.

  • The SWLS index without item 5, transformed into a 0–100 scale.

  • Item 3 as a single indicator, on a 0–10 scale.
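The four calculations listed above can be sketched as follows for a single respondent. This is an illustration only: it assumes each index is the mean of its items multiplied by 10, and the function name `swls_variants`, the dictionary keys and the `reverse_item5` flag are hypothetical. The flag covers versions in which item 5 is negatively worded and must be reverse-scored on the 0–10 scale.

```python
from statistics import mean


def swls_variants(items, reverse_item5=False):
    """Return the four SWLS calculations for one respondent.

    `items` holds the five SWLS answers (item 1 first) on a 0-10 scale.
    Assumption: each index is the mean of its items multiplied by 10.
    """
    if len(items) != 5:
        raise ValueError("the SWLS has exactly five items")
    scores = list(items)
    if reverse_item5:
        # Reverse-score a negatively worded item 5 on a 0-10 scale.
        scores[4] = 10 - scores[4]

    def without(item_number):
        # Drop one item, using 1-based item numbers as in the text.
        return [s for i, s in enumerate(scores, start=1) if i != item_number]

    return {
        "swls_full": 10 * mean(scores),
        "swls_without_item3": 10 * mean(without(3)),
        "swls_without_item5": 10 * mean(without(5)),
        "item3": scores[2],
    }


# Hypothetical respondent with a low score on item 5.
result = swls_variants([8, 7, 9, 8, 3])
print(result)
```

For this hypothetical respondent the index rises when item 5 is dropped and falls when item 3 is dropped, which is the kind of contrast the variants are meant to expose.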

The psychometric properties of this scale have been published in various articles. See, for example, Pavot and Diener (1993). The originally reported Cronbach α was 0.87 (Diener et al. 1985). In the original testing, a single component accounted for 66% of the variance.

2.1.3 Brief Multidimensional Students’ Life Satisfaction Scale (BMSLSS)

This scale was designed by Seligson et al. (2003) and developed for use with students aged 8–18. It includes 5 items referring to satisfaction with different life domains. Responses were originally coded on a scale of 1 to 7, from terrible to delighted. The psychometric properties of this scale have been published in different articles. A 0.68 Cronbach’s α was originally reported (Seligson et al. 2003).

We have changed the 1–7 scale to a 0–10 scale in order to make it more sensitive. Labels have been given to each value, describing satisfaction with each life domain from terrible to delighted. The overall aggregate score has been transformed into a 0–100 scale.

2.1.4 Single Item on Overall Life Satisfaction (OLS)

The importance of including a single-item scale on overall life satisfaction when studying subjective well-being has been highlighted by Campbell et al. (1976). In our research, we included a question on “Satisfaction with your overall life”, using an end-labelled 0–10 scale, from completely dissatisfied to completely satisfied. The translation of the item back into English is: “Up to now and at present, to what extent are you satisfied with your life as a whole?”

2.1.5 Items Regarding Happiness and Satisfaction on Russell’s Scale (CAS)

Russell’s scale on core affects (2003) is based on the following question: “Please indicate how each of the following items describes your feelings when you think about your life in general.” Then a list of affects is presented, including one item on satisfaction and one item on happiness.

In our questionnaire we have included the original question with these two items, to be evaluated from 0 to 10, with the extreme values labelled from not at all to absolutely.

2.1.6 Single Item on Happiness Taking into Account Overall Life (HOL)

Campbell et al. (1976) also pointed out the importance of including a single-item scale on happiness when studying personal well-being. In our research we have included the following question: “Taking into account your overall life, would you say you are…?” and then options are offered using a 0–10 scale, from extremely unhappy to extremely happy. Only the extreme values are labelled.

This item was not included in the Argentinian questionnaire.

2.1.7 Fordyce’s One-Item Scale on Happiness

Fordyce’s scale is a single-item scale asking “In general, to what extent do you usually feel happy or unhappy?” (Fordyce 1988). Answers are given on an 11-point scale from completely unhappy to completely happy. Each value has a label.

2.1.8 Single Item on Satisfaction with My Overall Life

Authors of the BMSLSS add an item on overall life satisfaction to offer the possibility of using a 6-item version, this version showing a higher Cronbach α (Seligson et al. 2003). The item is worded “I would describe my satisfaction with my overall life as …”. However, according to some authors (Campbell et al. 1976; Cummins and Cahill 2000), an overall life satisfaction item should be considered at a higher level of abstraction than satisfaction with life domain items and should not be included on the same scale. We therefore report it as a single item.

2.2 Procedure

A two-stage cluster sampling design was used in each country to select the sample of adolescents. In the first stage, we randomly selected a number of secondary schools in the chosen region of each country (Gran Buenos Aires in Argentina, Rio Grande do Sul in Brazil, Valparaíso in Chile, Catalonia in Spain). At each school we proceeded in accordance with the regular ethical guidelines for administering questionnaires to children in each country. In Catalonia and Chile, the aims of the research were reported to the school director and to the parents’ association in order to receive permission. In Brazil, the personal informed consent of each parent was requested.

When a school agreed to participate, we randomly selected a number of classes until we filled a quota for each age group from each school. We then asked for cooperation from the class teacher. Following approval, and as soon as the ethical and formal procedures were concluded, the children were asked for their cooperation and were informed that their data would be treated confidentially and that they were free to refuse. The questionnaires were administered to the whole group in their regular classroom. One of their usual teachers and one or two researchers were present during the administration, and clarified any questions that arose.

2.3 Samples

The 8 scales studied here were included in a broader questionnaire regarding leisure time and media consumption. It was administered to three samples of adolescents aged 12 to 16 in Argentina (n = 342), Brazil (n = 1,588) and Spain (n = 2,900), and to a sample of adolescents aged 14 to 16 in Chile (n = 843). The characteristics of the overall sample in terms of age, gender and country are presented in Table 2. The two extreme age groups have more or fewer subjects depending on the period of the school year when the questionnaires were administered.

Table 2 Sample of adolescents aged 12 to 16, by age and gender

School years are not organized identically in each country studied. In Brazil compulsory schooling finishes at 14, in Spain and Argentina at 16 and in Chile at 18. The names and levels of years also differ.

With regard to the representativeness of samples in relation to the overall population it must be emphasized that the sample from Brazil only includes those students continuing in formal post-compulsory education after 14, and lower-class adolescents from these two countries are therefore probably underrepresented in our samples.

Females comprise 55.8% of the overall sample. They make up 65.6% of the Brazilian sample, 54.2% of the Chilean sample and 50.9% of the Spanish one. The sample in Brazil includes far more girls than boys due to the ethical sampling procedure, which requires explicit consent from parents. The research team reported that they always obtain more signed informed consent from girls’ parents, girls of this age range being more active in bringing the consent back to school.

3 Results

3.1 Missing Values

Overall missing value rates for each scale are reported in Table 3. In general, the Spanish sample shows more missing values for almost all scales, with the exception of the BMSLSS, which has a higher percentage in Argentina.

Table 3 Missing values by scale and country

In all countries, the PWI shows a higher percentage of missing values than the SWLS, the BMSLSS or any of the single items, probably because it has more items. Missing percentages increase when the items “satisfaction with spirituality” and “satisfaction with religion” are included.

As the samples are large, all calculations have been made using listwise deletion, with no substitution of missing values, except for structural equation modeling, which has used missing value imputation by regression.
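As a minimal sketch of listwise deletion and of a scale-level missing rate, the following assumes that each respondent is stored as a dictionary with `None` marking a missing answer, and that a scale counts as missing when at least one of its items is unanswered. Both are assumptions made for illustration, as are the item names; the text does not specify how the percentages in Table 3 were computed.

```python
def missing_rate(rows, items):
    """Percentage of respondents with at least one missing item in `items`.

    Assumption: a scale is 'missing' for a respondent as soon as any of
    its items is unanswered (None).
    """
    if not rows:
        raise ValueError("no respondents")
    incomplete = sum(1 for r in rows if any(r.get(i) is None for i in items))
    return 100 * incomplete / len(rows)


def listwise_delete(rows, items):
    """Keep only respondents who answered every item in `items`."""
    return [r for r in rows if all(r.get(i) is not None for i in items)]


# Hypothetical two-item scale answered by four respondents.
sample = [
    {"item_a": 8, "item_b": 7},
    {"item_a": None, "item_b": 9},
    {"item_a": 6, "item_b": 6},
    {"item_a": 10, "item_b": 5},
]
print(missing_rate(sample, ["item_a", "item_b"]))       # prints 25.0
print(len(listwise_delete(sample, ["item_a", "item_b"])))  # prints 3
```

Because a respondent is dropped when any single item is missing, scales with more items lose more cases under this rule, which is consistent with the higher missing rate observed for the PWI.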

3.2 Exploratory Data Analyses

Skewness of the variables ranged from −1.023 to −1.885 and kurtosis from 0.982 to 4.118. These departures from normality were handled in structural equation models by means of bootstrap, as implemented in the AMOS 19 software. Outliers were checked using Mahalanobis’ distance. Constant extreme answers were also searched for. Seventeen subjects of the overall sample showed constant extreme positive answers to all items of the PWI + OLS + SWLS. These 17 subjects were deleted (11 from Spain, 1 from Brazil, 2 from Argentina and 3 from Chile), as such constant answers are suspected not to be reliable. No subject showed constant extreme negative answers.
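The screening for constant extreme answers can be sketched as below. The data layout (one dictionary per respondent), the item names and the function name are hypothetical; the rule itself, flagging respondents who give the maximum score to every item considered, follows the description above.

```python
def has_constant_extreme_answers(row, items, extreme=10):
    """True if the respondent gave the extreme value to every listed item."""
    return len(items) > 0 and all(row.get(i) == extreme for i in items)


def drop_constant_extreme(rows, items, extreme=10):
    """Remove respondents whose answers are constantly extreme."""
    return [r for r in rows if not has_constant_extreme_answers(r, items, extreme)]


# Hypothetical three-item check on three respondents.
items = ["q1", "q2", "q3"]
sample = [
    {"q1": 10, "q2": 10, "q3": 10},  # flagged: always the maximum
    {"q1": 10, "q2": 9, "q3": 10},   # kept: one answer differs
    {"q1": 0, "q2": 0, "q3": 0},     # kept here; flagged only if extreme=0
]
print(len(drop_constant_extreme(sample, items)))  # prints 2
```

Calling the same function with `extreme=0` would screen for constant extreme negative answers.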

3.3 Cronbach’s α for the 3 Multi-item Scales

Details of the Cronbach’s α results obtained are shown in Table 4. PWI-7 shows a rather high α when including both the item on “satisfaction with community” and “satisfaction with groups you belong to”, although it is slightly higher for the first. However, when adding “satisfaction with spirituality”, the α falls slightly and when adding “satisfaction with religion” it clearly decreases.

Table 4 Cronbach’s alpha for different variations of PWI, SWLS and BMSLSS. Pooled sample for 4 countries

Item 5 of the SWLS proved to be problematic with adolescents, as with previous results in Spain. If deleted, the α increased clearly.

In both the SWLS and the 6-item version of the BMSLSS, if the overall direct item on life satisfaction is deleted (items 3 and 6 respectively) each α clearly decreases, showing these items to have a higher weight than any of the other items on each scale and supporting the need to analyze them separately (Campbell et al. 1976; Cummins and Cahill 2000).

Most alphas are higher than those reported in the original articles by designers of the scales, although the SWLS and PWI referred to an adult population. The SWLS has a clearly lower α than the other two scales.
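For readers who wish to reproduce this kind of analysis, the sketch below computes Cronbach’s α with the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of the total score), together with the “α if item deleted” statistic used above. The function names and the toy data are hypothetical, and complete cases are assumed.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents (each a list of item scores)."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def alpha_if_item_deleted(rows, item_index):
    """Alpha recomputed after removing one item (0-based index)."""
    reduced = [[v for j, v in enumerate(r) if j != item_index] for r in rows]
    return cronbach_alpha(reduced)


# Toy data: items 1 and 2 agree perfectly, item 3 behaves differently.
toy = [[1, 1, 5], [2, 2, 1], [3, 3, 4], [5, 5, 2]]
print(round(cronbach_alpha(toy), 3))
print(round(alpha_if_item_deleted(toy, 2), 3))
```

In the toy data α increases once the third item is removed, which mirrors the pattern reported for item 5 of the SWLS; an item whose removal lowers α, as with item 3 of the SWLS, carries more weight in the scale.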

3.4 Principal Component Analysis for the 3 Multi-item scales

Items of all variations of the 3 scales show high percentages of variance explained by a single principal component. However, explained variance changes clearly in some cases. In the case of the PWI, the highest explained variance is related to the inclusion of “satisfaction with community” and to the exclusion of “satisfaction with spirituality” and “satisfaction with religion” (Table 5).

Table 5 PCA explained variances by a unique component

Explained variances are noticeably lower for all scales than those originally reported by the authors, although the SWLS and PWI referred to adult populations.
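The percentage of variance explained by a single component can be illustrated with a small sketch. It assumes that the principal component analysis is run on the item correlation matrix (the text does not say whether correlations or covariances were used), in which case the share explained by the first component is the largest eigenvalue divided by the number of items. The eigenvalue is approximated here by power iteration in plain Python; all names and the toy data are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def correlation_matrix(rows):
    """Pearson correlation matrix of the items (columns) in `rows`."""
    n, k = len(rows), len(rows[0])
    cols = [[r[j] for r in rows] for j in range(k)]
    # Standardize each item; items must not be constant.
    z = [[(x - mean(c)) / pstdev(c) for x in c] for c in cols]
    return [[sum(a * b for a, b in zip(z[p], z[q])) / n for q in range(k)]
            for p in range(k)]


def first_component_share(rows, iterations=500):
    """Share of total variance explained by the first principal component."""
    corr = correlation_matrix(rows)
    k = len(corr)
    # Slightly asymmetric start vector; power iteration can still fail if
    # the start happens to be orthogonal to the leading eigenvector.
    vec = [1.0 / (j + 1) for j in range(k)]
    for _ in range(iterations):
        prod = [sum(corr[p][q] * vec[q] for q in range(k)) for p in range(k)]
        norm = sqrt(sum(x * x for x in prod))
        vec = [x / norm for x in prod]
    prod = [sum(corr[p][q] * vec[q] for q in range(k)) for p in range(k)]
    eigenvalue = sum(v * p for v, p in zip(vec, prod))  # Rayleigh quotient
    return eigenvalue / k


# Toy data: three items that move together closely but not perfectly.
toy = [[1, 2, 1], [2, 3, 3], [3, 3, 4], [5, 6, 4], [4, 5, 5]]
print(round(100 * first_component_share(toy), 1))
```

In practice a statistical package would be used for this step; the sketch only makes explicit what the reported percentages represent.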

3.5 Sensitivity of Well-Being Averages to Scale Choice

Table 6 shows overall and by country means. For comparative purposes, they have been computed from the subsample in the 14–16 age range, which is available for all countries. Age groups have been given the same weight in all countries.
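One way to give age groups the same weight is to average the per-age means rather than pooling all respondents. The sketch below assumes this is what the equal weighting amounts to; the function name and the scores are hypothetical.

```python
from statistics import mean


def age_balanced_mean(scores_by_age):
    """Mean of the per-age-group means, so every age group weighs the same."""
    group_means = [mean(scores) for scores in scores_by_age.values()]
    return mean(group_means)


# Hypothetical 0-100 scores with unequal group sizes.
scores = {14: [80, 90], 15: [70], 16: [60, 60, 60]}
pooled = mean(s for group in scores.values() for s in group)
print(round(age_balanced_mean(scores), 2), round(pooled, 2))
```

With unequal group sizes the balanced mean differs from the pooled mean, so a country that happens to have more respondents of one age is not pulled towards that age group’s level.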

Table 6 Mean results for each country with the different scale variations and items on adolescents’ well-being (14–16)(1)

Brazil is the only country in which both item wordings were administered: satisfaction with community and satisfaction with groups of people you belong to. With data from this country we can compare results with or without each of these items. In Table 6 we observe that all variations using satisfaction with community score more than one point lower in means for the total population than when using satisfaction with groups you belong to. A hypothesis to be tested in the future is whether scores in Argentina would also increase if the item on groups you belong to were used instead of the item on community.

In all 4 countries inclusion of the item on satisfaction with spirituality makes the overall mean decrease. Inclusion of satisfaction with religion decreases scores even more clearly in Brazil, Chile and Spain, but not so in Argentina, where satisfaction with religion is about equal to satisfaction with spirituality.

On a 0 to 100 scale, SWLS scores tend to be much lower than those of any other scale. Most variations of the PWI tend to have higher scores than the BMSLSS in the four countries studied. In both the Brazilian sample and the pooled sample for the 4 countries the highest scores are found for the item on happiness on the CAS scale. However, in Spain and Chile the highest scores appear for item 3 of the SWLS, and in Argentina for item 6 of the BMSLSS.

If item 3 of the SWLS is deleted, scores for this scale decrease dramatically in all countries. The opposite happens when item 5 of the SWLS is deleted. We have found no evidence of this being observed among adult populations and we therefore think it is probably a phenomenon characteristic of adolescent populations. As already pointed out in Section 2.1.2, we suspect that many adolescents, despite considering their life experiences to be very satisfactory, view changes and new experiences in life as being more attractive than an imaginary repetition of their life.

The two single-item scales we have used to explore happiness (HOL, Fordyce’s) tend to score lower than most of the others, excepting the SWLS. However, and surprisingly, this is not the case with the item on happiness on the CAS scale, as mentioned above. The fact that items including the wording “happiness” (rather than “satisfaction”) show such differences suggests that more research must be done (at least with adolescents) to understand the effects of the different wordings of items.

3.6 Average Differences by Country

For 14–16 year-olds, that is, the only age range for which we have data from all 4 countries (Table 6), Brazil has the highest levels of personal well-being among adolescents, above the expected range for western countries (70–80) using the PWI-7. However, the adolescents from the other 3 countries are also in the upper part of this range.

Surprisingly, according to the SWLS scores, Brazilian adolescents are the ones with the lowest well-being among the 4 countries, Chile showing the highest.

With both the BMSLSS and the OLS Brazilian adolescents once again show the highest scores, and Spain the lowest.

On the other hand, with the above two scales Spain and Chile show significantly decreasing scores with age for the studied range. This tendency is only reflected in Chile with the PWI-7, however, and in Argentina and Chile with the SWLS.

Depending on the measure used, not only overall country scores, but also the tendency of scores to decrease with age shows slightly different characteristics in each country, as shown in Table 7.

Table 7 Trend of decreasing with age according to different measures by country

In Argentina, by contrast with all other countries, an important increase is observed in most of the scales at the age of 15. Such a difference in adolescent well-being may simply be due to the larger sampling variance in this country, and should be explored further in the future.

Our results suggest that more detailed future analysis would help to clarify such differences, which we suggest may be related to at least the following three factors:

  • Each scale may be sensitive to socio-cultural aspects of the different contexts where it has been administered. We are as yet unaware of what these might be.

  • Each scale may measure different aspects of subjective well-being. Although correlations among them are high, they may be capturing real differences between countries.

  • The different samples in each country may include different kinds of biases and each scale may be sensitive to some of such biases.

Gender analysis of the results shows interesting differences among countries that require future in-depth analysis.

3.7 Correlations Among All Scales and Single Items

In order to assess the degree to which the different scales and single items overlap and complement one another, we computed all zero order Pearson correlations. In this section we again use the pooled sample of adolescents aged 12 to 16 years old (Table 8). The three studied multi-item scales show high correlations with each other. The PWI-7 using the item of satisfaction with groups you belong to (Brazil, Chile and Spain) shows a correlation of 0.593 with the SWLS and 0.610 with the BMSLSS. The correlation between the BMSLSS and the SWLS is 0.624.

Table 8 Correlation of different scale variations and items on 12–16 y.o. adolescents’ well-being

Although the sample size is reduced, these three correlations slightly increase if we take only the 14–16 age range in the 4 country samples: 0.594, 0.615 and 0.632, respectively.

The different versions of the PWI show very high correlations with one another, but these are clearly lower when the item on satisfaction with religion is included.

However, when we take the item of satisfaction with the community (Argentina and Brazil) instead of satisfaction with groups belonged to, the correlations change: the correlation increases between the PWI and the BMSLSS (0.621), but between the PWI and the SWLS (0.592) it slightly decreases for the 12–16 age range. Both correlations are lower for the 14–16 age range (0.600 and 0.586).

All of these results support Diener’s proposal that all such positive constructs may form part of a kind of supra-construct.

The single item on overall life satisfaction (OLS) shows the highest correlation with the PWI-7 when it includes the wording of satisfaction with groups of people you belong to (0.581). All other correlations between the OLS and the other well-being scales and items here studied are around or above 0.5 when considering only the 14–16 age range. This supports Cummins’ proposal that overall life satisfaction should be considered as a core construct.

The highest correlation between the OLS and a single item is observed with item 6 of the BMSLSS, meaning these two items are fairly redundant.

In general, the lowest correlations observed for all multiple and single-items are with Fordyce’s scale on happiness – although these are all still quite high. Most correlations are clearly higher with the single-item on happiness with overall life (HOL) than with Fordyce’s scale, reflecting the fact that the phrasing of the question leads to differences in response, but probably also showing that differences arise according to whether all values of the scale are labelled or not.

3.8 Regressions of Scale Items on OLS and HOL

With the aim of testing the contribution of the PWI, SWLS and BMSLSS to the higher order constructs usually considered in international debates, ordinary least squares linear regressions were fitted to predict the single items for Overall Life Satisfaction (OLS) and for Happiness with Overall Life (HOL). The individual items on each of the three multiple-item scales were used as predictors, including the items reflecting the different variations of the scales. Regressions were estimated on the pooled sample of the 4 countries. Table 9 displays data regarding which items are non-significant in each case, and the explained variance (R² adjusted for degrees of freedom).
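The adjustment for degrees of freedom can be made explicit with the usual formula, adjusted R² = 1 − (1 − R²)(n − 1)/(n − p − 1), where n is the number of observations and p the number of predictors. The sketch below is illustrative; the function name and the input values are hypothetical and are not taken from Table 9.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R-squared for a linear regression with an intercept."""
    dof = n_obs - n_predictors - 1
    if dof <= 0:
        raise ValueError("not enough observations for this many predictors")
    return 1 - (1 - r2) * (n_obs - 1) / dof


# Hypothetical regression: 7 predictors, 101 respondents, R-squared of 0.50.
print(round(adjusted_r2(0.50, 101, 7), 4))  # prints 0.4624
```

With samples as large as those pooled here the adjustment is very small, but it keeps scale variations with different numbers of items comparable.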

Table 9 Least Squares linear regressions of OLS on the items of the 3 scales. 12–16 year-old adolescents

All different item combinations on the PWI clearly show higher adjusted R² when regressed on the OLS than when regressed on the HOL. The SWLS also shows slightly better adjusted R² when regressed on the OLS, but the BMSLSS items show better adjusted R² when regressed on the HOL.

When we regress overall scores for the different combinations of the 3 scales on OLS, all adjusted R² are lower than for the same regression on the HOL.

Results suggest that the PWI-7 functions best in relation to the OLS when the item on satisfaction with groups you belong to is included and the item on satisfaction with spirituality is not included. However, the PWI-7 functions best in relation to the HOL when the item on satisfaction with groups you belong to is included and the item on satisfaction with religion is not included.

Surprisingly, the BMSLSS item on satisfaction with friends is not significant when regressed on the OLS using the 6-item version, but is when regressed on the HOL. We have found no previous reports of such results. Satisfaction with friends clearly seems to be an important aspect of adolescents’ life. Therefore, this is probably a perverse consequence of the collinearity of item 6 with the OLS.

It is also surprising that the item on satisfaction with personal safety is non-significant when the PWI version with the item satisfaction with community is included and regressed on the HOL. This item on personal safety is significant when the regression is on the OLS.

Our results suggest that at the studied age range the wordings “happiness” and “life satisfaction” carry important differences in meaning, although they do correlate.

3.9 Relationship Between the 3 Multi-item Scales, Age, the OLS and the HOL

The different psychometric scales measuring subjective well-being and other psychological positive constructs may belong to the same construct and thus overlap to a great extent (Stones and Kozma 1985; Diener et al. 1999). We decided to explore different structural equation models to ascertain the extent of this overlap between the three scales and the overlap with the single OLS and HOL items, and to relate subjective well-being to age. Only three countries are used, as our Argentinian sample is too small for this purpose. What is more, we did not have data for the HOL in Argentina.

In order to assess the validity of the factorial structure across the three countries, we tested different models using the AMOS 19 structural equation modeling (SEM) software. Maximum likelihood (ML) estimation was used. Because the data distribution deviates from multivariate normality, we report bootstrap rather than ML inferences. Fit statistics for the different models analyzed are presented in Table 10: CFI (Comparative Fit Index), RMSEA (Root Mean Square Error of Approximation) and SRMR (Standardized Root Mean Square Residual). We assume that acceptable results are those in which CFI is above .95 and RMSEA and SRMR are below .05 (Batista-Foguet and Coenders 2000; Arbuckle 2010; Byrne 2010). However, according to some authors, RMSEA values as high as .08 represent reasonable errors of approximation in the population (Browne and Cudeck 1993; Byrne 2010).
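
As an illustration of how these cut-offs operate, the following minimal Python sketch computes RMSEA and CFI from chi-square statistics using their standard single-group textbook formulas and then applies the thresholds stated above. It is not the AMOS computation used in this study; the function names and the numbers in the example are hypothetical and are not taken from Table 10.

```python
import math


def rmsea(chi2, df, n):
    """Single-group RMSEA point estimate from the model chi-square."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_model, df_model, chi2_base, df_base):
    """Comparative Fit Index relative to the baseline (independence) model."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, d_model)
    return 1.0 if d_base == 0 else 1.0 - d_model / d_base


def acceptable(cfi_value, rmsea_value, srmr_value, rmsea_cutoff=0.05):
    """Apply the cut-offs from the text: CFI > .95, RMSEA and SRMR < .05.

    Pass rmsea_cutoff=0.08 for the more lenient RMSEA reading.
    """
    return cfi_value > 0.95 and rmsea_value < rmsea_cutoff and srmr_value < 0.05


# Hypothetical values, for illustration only (not results from this study).
example_rmsea = rmsea(chi2=100.0, df=50, n=501)
example_cfi = cfi(chi2_model=100.0, df_model=50, chi2_base=2066.0, df_base=66)
example_ok = acceptable(example_cfi, example_rmsea, srmr_value=0.03)
```

The `rmsea_cutoff` argument makes the choice between the strict (.05) and lenient (.08) criteria explicit, so the same model can be classified under both readings.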

Table 10 Fit statistics for the SEM (models 1 to 5, 8, 10 and 12 are single group on the pooled data set. Multi-group models include Spain, Brazil and Chile)

Initial confirmatory factor analysis models of each multi-item scale with the pooled sample of all 3 countries, relating the items to a single latent variable for each scale with no constraints or correlated errors, showed adequate goodness-of-fit statistics for the BMSLSS and SWLS (Models 1 and 2 in Table 10), although RMSEA was above the recommended values in both cases, but not for the PWI. By allowing some errors to covary, we fitted a modified model showing very good fit statistics for the PWI (Model 3). The errors we allowed to covary are: (1) satisfaction with interpersonal relationships and satisfaction with groups you belong to; (2) satisfaction with standard of living and satisfaction with health; and (3) satisfaction with standard of living and satisfaction with life achievements.

In the next step, we created an overall model using the models of all three scales, and correlated the 3 latent variables. The initial model did not show a proper fit. However, a modified model allowing some errors to covary did (Model 4). The errors we allowed to covary are:

  • One item of SWLS (item 2) with one item of the PWI (satisfaction with standard of living).

  • Two items of the BMSLSS with 3 items of the PWI (satisfaction with school and satisfaction with future security; satisfaction with friends and satisfaction with personal relations; satisfaction with friends and satisfaction with groups you belong to).

  • Two pairs of items from the PWI (satisfaction with standard of living and satisfaction with health, and satisfaction with groups of people you belong to and satisfaction with personal relationships).

The loading of item 5 on the latent variable SWLS was much lower than any other item (.360) and it was therefore removed from further models (Model 5).

Standardized factor loadings of the pooled sample range from .47 (satisfaction with groups you belong to and satisfaction with friends) to .82 (item 3 of the SWLS), indicating how the latent variable predicts each of the observed variables (Fig. 1). The lowest correlation is between the PWI and the BMSLSS (.75), while the highest is between the SWLS and the BMSLSS (.85), suggesting that the SWLS and the BMSLSS are measuring a very similar construct, while the PWI is measuring a rather different construct.

Fig. 1 Standardized estimates for Model 5. Pooled sample of 3 countries

In order to compare SEM coefficients across groups (across countries in our case), metric factor invariance must first be found to be tenable. Factor invariance refers to the extent to which items used in survey-type instruments mean the same to members of different groups and is a requisite before the groups can be compared in a meaningful way; otherwise, group differences in means or regression coefficients could be attributable to true differences in group distributions or to a different meaning of variables (Meredith 1993). The simplest test of metric invariance consists in first fitting a multiple-group SEM to data from all groups (in our case, countries) by allowing all parameters to be different across groups (i.e. without imposing any equality constraint: Model 6). If this model fits the data well, the next step is to constrain the unstandardized factor loadings (in our case of the PWI, BMSLSS and SWLS indicators) to make them equal across groups. Metric factor invariance is tenable if the fit of the constrained model is not considerably worse than the fit of the unconstrained model (Brannick 1995; Kelloway 1995; Cheung and Rensvold 2002).
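
The comparison between the unconstrained and the constrained model can be sketched as a simple decision rule. The snippet below assumes one commonly used operationalization of "not considerably worse", namely a drop in CFI of no more than .01, a criterion usually attributed to Cheung and Rensvold (2002). That numeric threshold, the function name and the example CFI values are assumptions made for illustration; they are not the values reported in Table 10.

```python
def metric_invariance_tenable(cfi_unconstrained, cfi_constrained, max_cfi_drop=0.01):
    """Compare a multi-group model with free loadings against one with
    loadings constrained to be equal across groups.

    Returns (tenable, drop), where drop is the loss in CFI caused by the
    equality constraints and tenable is True when drop <= max_cfi_drop.
    The .01 default is an assumed rule of thumb, not a value from this study.
    """
    drop = cfi_unconstrained - cfi_constrained
    return drop <= max_cfi_drop, drop


# Hypothetical CFI values, for illustration only.
tenable, drop = metric_invariance_tenable(cfi_unconstrained=0.970, cfi_constrained=0.965)
```

A chi-square difference test between the two nested models is a frequent alternative, although with large samples it tends to reject invariance even for trivial differences in loadings.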

Model 7, in which loadings are constrained to be equal across countries so that the data are comparable, still shows a good fit, with CFI and RMSEA only marginally worse than those in Model 6.

Model 7 allows us to study the overlap between dimensions separately for each country. The correlation between the SWLS and the BMSLSS was .830 in Brazil, .933 in Chile and .841 in Spain; the correlation between the SWLS and the PWI was .801 in Brazil, .863 in Chile and .839 in Spain; and the correlation between the PWI and the BMSLSS was .752 in Brazil, .774 in Chile and .746 in Spain.

We then included the observed variable “age” in Models 5 and 7 as a predictor of the three dimensions, and the resulting Models 8 and 9 showed acceptable fit statistics. A small reduction in well-being with age is observed fairly equally across the 3 scales. The pooled standardized estimates (Model 8) are displayed in Fig. 2. The country specific standardized estimates (Model 9) are displayed in Table 11, together with their bootstrap confidence limits. Reduction of well-being with age is lower in Chile, consistent with the fact that the age range studied for this country is more limited (14–16) than for Brazil and Spain (12–16). Note that even if unstandardized loadings are constrained across countries, which is necessary for comparison purposes, their standardized counterparts can differ, reflecting lower or higher error variances.
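
The last point can be made explicit with a short derivation. The notation is introduced here for illustration and does not appear elsewhere in the paper: item j in country g is written as a function of its latent variable with an unstandardized loading that is equal across countries, while the latent variance and the error variance are free to differ by country.

```latex
\begin{align}
  x_{jg} &= \lambda_j \, \xi_g + \delta_{jg},
  \qquad \operatorname{Var}(\xi_g) = \phi_g,
  \quad \operatorname{Var}(\delta_{jg}) = \theta_{jg}, \\
  \lambda^{\mathrm{std}}_{jg}
  &= \frac{\lambda_j \sqrt{\phi_g}}{\sqrt{\lambda_j^{2}\,\phi_g + \theta_{jg}}} .
\end{align}
```

Even with the same unstandardized loading in every country, a larger error variance in one country increases the denominator and therefore lowers the standardized loading in that country, and vice versa.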

Fig. 2 Standardized estimates for Model 8. Pooled sample of 3 countries

Table 11 Standardized Regression Weights for the 3 scales and age (Model 9 in Table 10)

Models 10 and 11 are similar to Models 8 and 9, but include the OLS rather than age. Figure 3 shows the pooled standardized estimates, and Table 12 the country-specific standardized estimates and their confidence intervals. Models 12 and 13 are the analogous models with the HOL (Fig. 4 and Table 13). The goodness of fit of all four models is acceptable, as reflected in Table 10.

Fig. 3 Standardized estimates for Model 10. Pooled sample of 3 countries

Table 12 Standardized regression weights for the 3 scales and the OLS (Model 11 in Table 10)

Fig. 4 Standardized estimates for Model 12. Pooled sample of 3 countries

Table 13 Standardized Regression Weights for the 3 scales and the HOL (Model 13 in Table 10)

As regards the pooled estimates in Figs. 3 and 4, of the 3 multiple-item scales, the SWLS shows the highest regression weights on the OLS (.67), while the BMSLSS regresses highest on the HOL (.68). Interestingly, the PWI and SWLS show higher regression weights on the OLS than on the HOL, while the opposite is true for the BMSLSS. The lowest regression weight is that of the PWI on the HOL (.58). All of these results suggest that each scale measures rather different aspects of a good life.

As regards country-specific results, in Brazil, of the 3 multiple-item scales, the PWI shows the highest regression weight on the OLS (.589), while the BMSLSS regresses highest on the HOL (.674). Only the PWI shows higher regression weights on the OLS than on the HOL. Loadings in Tables 11, 12 and 13 are generally very similar. If we take, for instance, those for the regression on the OLS (Table 12), the standardized factor loadings of the 17 items range from .433 (satisfaction with friends, from the BMSLSS) to .844 (item 2 of the SWLS).

In Chile, of the 3 multiple-item scales, the SWLS shows the highest regression weight on the OLS (.748), while the BMSLSS regresses highest on the HOL (.699). The PWI and SWLS show higher regression weights on the OLS than on the HOL. Standardized factor loadings of the 17 items range from .461 (satisfaction with health, from the PWI) to .850 (item 2 of the SWLS) when the OLS is included in the model.

In Spain, of the 3 multiple-item scales, the SWLS shows the highest regression weights on the OLS (.699) and on the HOL (.696). The PWI and SWLS show higher regression weights on the OLS than on the HOL. Standardized factor loadings of the 17 items range from .473 (satisfaction with groups, from the PWI) to .809 (item 2 of the SWLS) when the OLS is included in the model.

As a provisional conclusion, we may say that the results show only slight differences among scales, and also among countries. However, these differences require further testing in the future.

4 Discussion

Our results show that adolescents in the four studied countries score high in their subjective well-being, around the highest expected upper limit according to normative values in Western countries (Cummins and Cahill 2000), and even above that limit in Brazil. However, it is also evident that their well-being decreases with age in all 4 countries.

It is difficult to be precise with the comparisons because the different wording of each item may have a slightly different meaning in different languages or cultural contexts. For example, in Spain an item on satisfaction with community had to be replaced with an item on satisfaction with the groups of people you belong to. In Brazil, when the first of these two items was included to calculate the Index, scores were always higher than when the second item was included instead.

The three multi-item scales used here explore slightly different life domains or aspects of life. For example, only the BMSLSS directly explores satisfaction with family and satisfaction with oneself. These items present very high standardized regression weights in all our models in the 3 countries for which we were able to analyze results using SEM. This is also the case with the item on satisfaction with achievements in the PWI. However, satisfaction with the family becomes “diluted” in the PWI for the item on satisfaction with relations with other people and standardized regression weights are not that high.

The fact that our models show good fit statistics when using multi-group SEM is a very promising result, showing that although there may be many limitations in the precise comparability of items and scales, the overall measures we have used seem to be comparable among countries for this age group.

The ten studied scales (3 multiple-item, 3 single-item, and 4 single items from a larger scale) and their variations have shown various strengths and weaknesses when used with 12–16 year-old adolescents in four Latin-language-speaking countries. All of them seem to have enough qualities to be potentially used as subjective well-being indicators in the international arena. However, it is also true that results will slightly or even clearly differ across countries depending on the scale used.

The scales including the wording “happiness” show a broader diversity of results than the scales including the wording “satisfaction with life”. Although it seems clear that among adolescents of the studied age ranges the two concepts are not understood to have the same meaning, they do appear to be related (correlation observed in the pooled sample for the 4 countries is .544). However, SE Models with the three scales, including the OLS and HOL alternatively (Figs. 3 and 4), show very similar results, with standardized loadings that are high, but not extremely high, suggesting that other additional indicators are probably required for a better explanation of the two constructs. The different understanding of the two wordings offers us a complex panorama of differences in the statistical results, which will require further exploration.

When comparing the 3 multiple-item scales studied here, some of the most notable results are:

  • The SWLS has the highest R² when regressed on the OLS. However, these results are probably contaminated by item 3 of the SWLS, which is a higher order item, at least partially redundant with the OLS. With structural equation modelling it shows the highest regression weights on the OLS in Chile and Spain, probably for the same reason, and also on the HOL in Spain. Furthermore, item 5 is problematic with adolescents: Cronbach’s α would clearly increase if it were deleted. However, if item 3 is deleted all SE Models become unidentified, raising significant doubts about the suitability of using this scale with adolescents.

  • The PWI version with the item on satisfaction with groups you belong to (rather than the original item on satisfaction with community) shows the second best R² on the OLS. However, the PWI shows the lowest R² when regressed on the HOL. The PWI also displays more missing values in the four countries than the other two scales, probably due to the fact that it has more items. With structural equation modelling, a modified model of this scale shows much better fit statistics than the other two. However, it only shows the highest regression weights on the OLS in Brazil.

  • The BMSLSS shows the lowest missing percentage values in two countries (Chile and Spain) and higher explained variance among adolescents than the PWI in the PCA. The original 5-item version only includes significant items when regressing on the OLS, but this is not the case for the version including an additional item. With structural equation modelling the original version shows the highest regression weights on the HOL in Brazil and Chile. This scale shows higher regression weights on the HOL than on the OLS in the 3 countries studied.

  • When including only the 3 latent variables referring to the 3 scales in the same SEM (Model 5, Fig. 1), the lowest correlation is between the PWI and BMSLSS, while the highest is between the SWLS and BMSLSS. This result seems counter-intuitive: it suggests that the SWLS and BMSLSS are measuring a very similar construct, while the PWI is measuring a rather different one, whereas conceptually the PWI and BMSLSS both measure life domains, while the SWLS is a “context-free” scale and therefore very different from the other two. Being context-free, it can somehow lie in between the two specific contexts posed by the remaining two scales, which may also contribute to the higher correlations.
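
The remark about item 5 of the SWLS rests on the "alpha if item deleted" diagnostic. The following minimal Python sketch shows how that diagnostic behaves, using simulated scores rather than data from this study: four items share a common factor and a fifth is pure noise, standing in for a weakly related item. The function names and the simulated design are assumptions made for illustration.

```python
import random
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha; items is a list of equal-length score lists, one per item."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(column) for column in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def alpha_if_deleted(items):
    """Alpha recomputed with each item left out in turn."""
    return [cronbach_alpha(items[:i] + items[i + 1:]) for i in range(len(items))]


# Simulated data (not from the study): 4 items driven by one factor, 1 noise item.
rng = random.Random(0)
n = 2000
factor = [rng.gauss(0, 1) for _ in range(n)]
related_items = [[f + rng.gauss(0, 1) for f in factor] for _ in range(4)]
noise_item = [rng.gauss(0, 1) for _ in range(n)]
items = related_items + [noise_item]

alpha_all = cronbach_alpha(items)
alpha_dropped = alpha_if_deleted(items)  # the last entry drops the noise item
```

In this simulated design, alpha rises when the unrelated item is dropped and falls when any of the related items is dropped, which is the pattern described above for item 5.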

When expanding the first order model by including the OLS on the one hand or the HOL on the other, we encountered strong relationships between the single item and multi-item measures.

When age is added to our SE Models including the 3 scales together it becomes clear that well-being regularly decreases with age within this age range. This is less so in Chile, consistent with the fact that the age range studied for this country is more limited (14–16) than for Brazil and Spain (12–16).

Our results suggest that much research work must still be conducted. An initial consequence of our results is to recommend that future research for cross-country comparability collect data using more than one of the scales analyzed here, and with at least two single-item scales, one on overall life satisfaction and another one on happiness with overall life. We need much more data and from more countries to analyze in any real depth the qualities and possible weaknesses of each scale for the international comparison of adolescent populations.

Secondly, the fact that the correlations among the 3 scales are high but not perfect means that the 3 scales are somehow complementary and their items can be considered as part of a broader scale including 3 major components, although some of the items share covariance. Some authors have even hypothesised that different psychometric scales measuring subjective well-being and other psychological positive constructs may belong to the same higher order construct (Stones and Kozma 1985; Diener et al. 1999). However, it is important to take into account that when three first order factors are present, the first and second order models are equivalent and thus not testable against each other. We did try to fit the second order multiple group factor analysis model, but the estimates from Chile included a negative residual variance for the SWLS. Therefore, although the high correlations among dimensions seem to support the hypothesis of an existing supra-construct, further research will be required to explore this matter, ideally with four or more dimensions.
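
The equivalence and the negative residual variance can both be illustrated with a short calculation. With three first order factors, a second order factor reproduces the three correlations among them through three standardized second order loadings, so the structural part is just-identified and each squared loading follows directly from the correlations (for example, the squared loading of the SWLS equals its correlation with the BMSLSS times its correlation with the PWI, divided by the correlation between the BMSLSS and the PWI). The sketch below applies this to the country-specific correlations reported above for Model 7. It is an approximation under stated assumptions and not the second order model actually fitted; the function name is hypothetical.

```python
def second_order_residuals(r_swls_bmslss, r_swls_pwi, r_pwi_bmslss):
    """Residual variances of three standardized first order factors under a
    single second order factor, derived from their pairwise correlations.

    Each squared second order loading is the product of the two correlations
    involving that factor divided by the remaining correlation; the residual
    variance is 1 minus the squared loading.
    """
    squared_loadings = {
        "SWLS": r_swls_bmslss * r_swls_pwi / r_pwi_bmslss,
        "BMSLSS": r_swls_bmslss * r_pwi_bmslss / r_swls_pwi,
        "PWI": r_swls_pwi * r_pwi_bmslss / r_swls_bmslss,
    }
    return {scale: 1.0 - g2 for scale, g2 in squared_loadings.items()}


# Correlations among latent variables as reported in the text for Model 7:
# (SWLS-BMSLSS, SWLS-PWI, PWI-BMSLSS)
model7_correlations = {
    "Brazil": (0.830, 0.801, 0.752),
    "Chile": (0.933, 0.863, 0.774),
    "Spain": (0.841, 0.839, 0.746),
}
residuals = {
    country: second_order_residuals(*r) for country, r in model7_correlations.items()
}
```

Under these assumptions the implied residual variance of the SWLS in Chile is slightly below zero, while all other residual variances are positive, which is consistent with the estimation problem described above.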