Introduction

Subjective well-being (SWB) in school refers to students’ subjective cognitive evaluations and emotional experiences in their everyday school lives based on their own standards (Tian 2008). This construct derives from Diener’s theory of SWB (Diener 1994). Based on this widely accepted theory, SWB in school comprises school satisfaction and affect in school. Specifically, school satisfaction reflects students’ subjective, cognitive evaluations of several aspects of school life (e.g., school management, teacher-student relationships) based on their own criteria. Affect in school includes positive and negative affect, which refer to the frequency of positive and negative feelings, respectively, that students experience during school.

In light of the large amount of time that students spend in school, students’ subjective schooling experiences play an important role in their learning and life outcomes (Park 2004). Previous research has shown that SWB in school is related to a number of important variables. For example, students with higher levels of SWB in school report higher levels of school belonging, engagement behaviors, and academic achievement (Elmore and Huebner 2010; Suldo and Shaffer 2008). Furthermore, students with higher levels of SWB in school also report fewer internalizing and externalizing behaviors, such as anxiety, depression, delinquent behaviors, and aggressive behavior (DeSantis-King et al. 2006; Suldo et al. 2015).

Many studies have investigated SWB in school across different age groups to explore its stability and change across developmental stages (Konu and Lintonen 2006; Liu et al. 2016; Park 2005; Suldo and Huebner 2006; Tomyn and Cummins 2011). For example, Park (2005) studied the development of global and domain-specific satisfaction among 736 South Korean students in elementary, middle and high schools, and the results showed that students’ school satisfaction (i.e., the cognitive component of SWB in school) scores decreased with age. Similarly, in a sample of 2158 Chinese mainland students in elementary, middle and high schools, Liu et al. (2016) found that older students reported lower levels of SWB in school than younger students. Although such studies generally support the conclusion that students’ SWB in school declines continuously with age, these differences could be explained in many ways. For example, the differences may be caused by varying levels of students’ cognitive ability. The interpretation of such differences raises the question of whether mean score differences reflect real differences in SWB in school or differences in students’ understanding of the construct or of the items of the measure used in the studies. To avoid this problem, researchers should evaluate and demonstrate measurement invariance across the groups of interest before proceeding to the next step in studies of between-group differences, that is, interpreting the meaning of mean score differences between the groups (Byrne and Campbell 1999; Chen 2008). Without evidence of measurement invariance, the interpretation of research findings can be compromised.

Therefore, it is necessary to study the measurement invariance across age groups of the scales that are used to evaluate SWB in school. In assessing students’ SWB in school, the Adolescents’ Subjective Well-Being in School Scale (ASWBSS; Tian 2008) can comprehensively assess SWB in the school context across both the cognitive and affective components. The ASWBSS requires students to report their school satisfaction (cognitive component) and the frequency of positive and negative affect in school (affective component) simultaneously. Psychometric evidence for the applicability of the ASWBSS to Chinese adolescents has been provided (e.g., Tian and Gilman 2009). However, the ASWBSS includes 50 items, and the use of the full version of the scale requires considerable time and resources on the part of students to complete.

In order to address these problems with the ASWBSS, Tian et al. (2015) developed a brief version of the ASWBSS, that is, the Brief Adolescents’ Subjective Well-Being in School Scale (BASWBSS). The BASWBSS includes two subscales. The Affect in School subscale consists of two items that assess students’ reports of the frequency of positive and negative affect experienced by them during school. The School Satisfaction subscale consists of six items that assess students’ judgments of their satisfaction with specific domains of school (i.e., achievement, school management, teacher-student relationships, peer relationships, teaching, and academic learning).

Research with the BASWBSS has supported its psychometric properties among Chinese elementary school students, demonstrating that the scale displays good internal consistency, construct validity, and criterion validity (Tian et al. 2016). Similar results in studies involving both elementary and middle school students have been reported from Turkey and Chile (Benavente et al. 2018; Özdemir and Sağkal 2016). As the BASWBSS has shown good psychometric properties in different countries and at different school levels, it has been used with students spanning the elementary, middle, and high school years (e.g., Asici and Ikiz 2019; Tian et al. 2015; Yang et al. 2019; Zhou et al. 2019). For example, using the BASWBSS, Yang et al. (2019) explored the relations among SWB in school, academic achievement, and self-esteem among elementary school students. Similarly, Liu et al. (2016) examined age differences in SWB in school scores among elementary, middle and high school students.

Although the BASWBSS serves as a suitably brief, yet valid measure of students’ SWB in school, its measurement invariance across age groups has not yet been studied. Without evidence of measurement invariance, researchers cannot determine clearly whether mean score differences reflect real differences or a lack of measurement equivalence across groups (Brown 2015). Moreover, without such evidence, studies of mean differences, correlates, and so on remain questionable (Chen 2008).

Therefore, our aim in the current study was to test the measurement invariance of the BASWBSS across three school levels (i.e., elementary, middle, and high school) with Chinese children and adolescents. Specifically, elementary school students represented children, and middle and high school students represented adolescents. The stage-environment fit hypothesis suggests that schools are systems characterized by multiple levels of regulatory processes (organizational, social, and instructional in nature) that regulate students’ cognitive, social-emotional, and behavioral development (Eccles 2004), and these processes may vary across different school levels. Moreover, according to Cole et al. (2001), developmental changes and school transitions often co-occur, so differences among students will be more pronounced across school levels. Therefore, for this study, we selected “school level” as the relevant grouping variable, rather than “age level”. We subsequently employed multi-group confirmatory factor analysis (MGCFA) procedures to test for measurement invariance across these groups. If scalar invariance was obtained, latent mean analyses would then be conducted. Previous studies indicated that older students reported less positive school evaluations and experiences than younger students (Nickerson and Nagle 2004; Park 2005); therefore, we hypothesized that as school level increased, students would report lower scores on the BASWBSS. In addition, because there is widespread scholarly interest in gender differences and the measurement invariance of the BASWBSS across gender has been established (Tian et al. 2015), a comparison of latent mean differences across gender was also conducted in the present study.

Method

Participants

Participants included 799 students (46.8% girls) in public schools from a city in southern China, with a mean age of 13.24 years (SD = 2.45; age range = 9 to 19). Of the participants, there were 329 (41.2%) students in elementary school from Grades 4 to 6, 303 (37.9%) students in middle school from Grades 7 and 8, and 167 (20.9%) students in high school from Grades 10 and 11. Elementary school students consisted of 200 boys and 129 girls. Middle school students consisted of 153 boys and 150 girls. High school students consisted of 72 boys and 95 girls. Almost all participants were from middle-income families.

Measures

Subjective Well-Being in School

Students’ subjective well-being (SWB) in school was measured with the Brief Adolescents’ Subjective Well-Being in School Scale (BASWBSS). It includes two subscales: School Satisfaction and Affect in School. The School Satisfaction subscale includes six items (e.g., “The teachers’ instructional methods and quality are good.”), addressing the six domains of school life as previously mentioned. Items are rated on a 6-point scale, ranging from 1 (strongly disagree) to 6 (strongly agree). The McDonald’s ω coefficient for the School Satisfaction subscale was 0.87 in this study.
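As a rough illustration of how an ω coefficient of this kind is obtained, the sketch below computes omega for a single factor from its loadings. The loadings are hypothetical (the item-level loadings appear only in Fig. 1), and the function assumes a unidimensional model with unit factor variance and, by default, standardized loadings; it is not the authors' computation.

```python
def mcdonald_omega(loadings, residual_variances=None):
    """Omega for one factor: (sum of loadings)^2 / total variance of the sum score.

    Assumes a unidimensional model with factor variance fixed to 1 and
    uncorrelated residuals.
    """
    if residual_variances is None:
        # Assumption: loadings are standardized, so each residual
        # variance equals 1 - loading^2.
        residual_variances = [1.0 - l ** 2 for l in loadings]
    common = sum(loadings) ** 2
    return common / (common + sum(residual_variances))


# Hypothetical standardized loadings for six satisfaction items.
hypothetical_loadings = [0.70, 0.72, 0.75, 0.68, 0.74, 0.71]
omega = mcdonald_omega(hypothetical_loadings)
print(round(omega, 3))
```

Unlike Cronbach's α, this coefficient does not require the items to have equal loadings, which is one common reason for reporting it alongside a CFA.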

The Affect in School subscale includes two items (i.e., “In school, the frequency of my pleasant feelings is…”; “In school, the frequency of my unpleasant feelings is…”), both of which are rated on a 6-point scale (1 = never, 6 = always). Tian et al. (2015) have provided evidence of the validity of these two items among adolescent students. They found meaningful and statistically significant correlations between these two items and the Positive Affect and Negative Affect in School subscales, respectively, of the Adolescents’ Subjective Well-Being in School Scale (ASWBSS; Tian 2008).

In order to support the validity of these two items of the Affect in School subscale among elementary school students, a pilot study was conducted prior to the current analyses. More detail about the procedure is presented in the online supplementary materials. The results of bivariate correlations showed that the total of the five items describing specific positive emotions (e.g., delighted, proud) derived from the Elementary School Students’ Subjective Well-Being in School Scale (ESSSWBSS; Liu et al. 2015) and the total of the five items describing negative emotions (e.g., sad, angry) from the same measure significantly related to the corresponding affect items of the BASWBSS Affect in School subscale. Significant correlations between two positive affect measures and between two negative affect measures were found (r = 0.59, p < 0.01 and r = 0.61, p < 0.01, respectively). The results provided evidence of the validity of the Affect in School subscale among elementary school students.

The score for the School Satisfaction subscale was obtained by averaging the responses to the six items, and the score for the Affect in School subscale was obtained by subtracting the score for negative affect from the score for positive affect.
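The scoring rule just described can be sketched as follows. The function name, argument names, and example responses are hypothetical; only the rule itself (mean of six satisfaction items, positive minus negative affect, all on 1 to 6 scales) comes from the text.

```python
from statistics import mean


def score_baswbss(satisfaction_items, positive_affect, negative_affect):
    """Return the two BASWBSS subscale scores for one respondent."""
    if len(satisfaction_items) != 6:
        raise ValueError("School Satisfaction requires exactly six item responses")
    responses = list(satisfaction_items) + [positive_affect, negative_affect]
    if any(not 1 <= r <= 6 for r in responses):
        raise ValueError("All responses must lie on the 1-6 rating scale")
    return {
        # Mean of the six satisfaction items (range 1 to 6).
        "school_satisfaction": mean(satisfaction_items),
        # Positive minus negative affect (range -5 to 5).
        "affect_in_school": positive_affect - negative_affect,
    }


# Hypothetical respondent.
example = score_baswbss([5, 4, 6, 5, 4, 5], positive_affect=5, negative_affect=2)
print(example)
```

Because the affect score is a difference of two 1 to 6 ratings, it can be negative, so it is not on the same metric as the satisfaction score.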

Procedure

The study was approved by the relevant school boards, principals, and teachers, as well as the Human Research Committee of South China Normal University. We informed students and their parents about the study, and we obtained student assent and parental consent before the data collection. During the administration of the measure, trained graduate assistants administered the measure to every student in regular classrooms, which included up to 40 students per classroom. The students could take as much time as needed to complete the measure. Participants were assured of the strict confidentiality of the collected data, and only research personnel had access to the completed questionnaires. In addition, assistants emphasized the importance of honest and accurate self-reporting for the study.

Data Analysis

Data were analyzed with SPSS 21.0 and Mplus 8.0 (Muthén and Muthén 2011). Baseline confirmatory factor analyses (CFA) were computed for each group separately. Next, tests of measurement invariance of the BASWBSS across school levels were conducted via MGCFA. The MGCFA method typically considers three different levels of measurement invariance: configural invariance, metric (weak) invariance, and scalar (strong) invariance (Davidov et al. 2012). We gradually increased constraints on the multi-group measurement model to examine whether the model continued to fit the data well. According to Little (1997), if configural invariance was supported (i.e., the same factor structure was obtained across the groups), then metric invariance (i.e., the factor loadings were equal across groups) should be tested. Finally, assuming demonstration of configural and metric invariance, we planned to evaluate scalar invariance, which assesses the equivalence of the item intercepts across groups. If full or partial scalar invariance was achieved, tests of latent mean differences could then be conducted (Van de Schoot et al. 2012; Vandenberg and Lance 2000).
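The three levels of invariance can be stated compactly in conventional MGCFA notation. The symbols below are assumed for illustration and do not appear in the original text: for student i in group g, x is the vector of item responses, τ the item intercepts, Λ the factor loadings, η the latent factors with mean κ, and ε the residuals.

```latex
\[
\begin{aligned}
\mathbf{x}_{ig} &= \boldsymbol{\tau}_g + \boldsymbol{\Lambda}_g \boldsymbol{\eta}_{ig} + \boldsymbol{\varepsilon}_{ig},
  \qquad g \in \{\text{elementary}, \text{middle}, \text{high}\} \\
\text{Configural:}\quad & \boldsymbol{\Lambda}_g \text{ has the same pattern of zero and nonzero loadings in every group} \\
\text{Metric:}\quad & \boldsymbol{\Lambda}_g = \boldsymbol{\Lambda} \text{ for all } g \\
\text{Scalar:}\quad & \boldsymbol{\Lambda}_g = \boldsymbol{\Lambda} \text{ and } \boldsymbol{\tau}_g = \boldsymbol{\tau} \text{ for all } g \\
\mathrm{E}\!\left[\mathbf{x}_{ig}\right] &= \boldsymbol{\tau}_g + \boldsymbol{\Lambda}_g \boldsymbol{\kappa}_g
\end{aligned}
\]
```

Under scalar invariance the last line reduces to τ + Λκ_g, so differences in expected item scores between two groups equal Λ(κ_g − κ_g′) and can be attributed to the latent means alone. Under partial scalar invariance this holds only for the items whose intercepts remain constrained.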

We chose multiple indices to evaluate model fit, including the relative chi square (χ2/df), Comparative Fit Index (CFI), Tucker-Lewis Index (TLI), Standardized Root Mean Square Residual (SRMR), and Root Mean Square Error of Approximation (RMSEA). χ2/df should be between 1.00 and 5.00 (Schumacker and Lomax 2004). Model fit was also considered acceptable when the CFI and TLI values were close to or greater than 0.95 and the SRMR value was smaller than 0.08 (Brown 2015). An RMSEA between 0.05 and 0.08 was considered reasonable (Marsh et al. 2004). To evaluate measurement invariance, we used multiple indicators of change in model fit, including the difference in CFI (ΔCFI) and the difference in RMSEA (ΔRMSEA). According to Cheung and Rensvold (2002), metric and scalar invariance are achieved if the change in CFI is 0.010 or less (ΔCFI ≤ 0.010). In addition, ΔRMSEA should be smaller than 0.015 (Chen 2007).
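The absolute-fit criteria above can be expressed as a simple screening function. This is a minimal sketch: the function name is hypothetical, "close to or greater than 0.95" is simplified to a strict cutoff of 0.95, and any RMSEA at or below 0.08 is treated as acceptable. The example values are the complete-sample indices reported in the Results.

```python
def acceptable_fit(chi2_df, cfi, tli, rmsea, srmr):
    """Check each fit index against the cutoffs described in the text.

    Assumptions: CFI/TLI use a strict 0.95 cutoff, and RMSEA values
    below 0.05 also count as acceptable.
    """
    return {
        "chi2_df": 1.00 <= chi2_df <= 5.00,
        "cfi": cfi >= 0.95,
        "tli": tli >= 0.95,
        "srmr": srmr < 0.08,
        "rmsea": rmsea <= 0.08,
    }


# Complete-sample values reported for the two-factor model.
complete_sample = acceptable_fit(
    chi2_df=4.927, cfi=0.967, tli=0.951, rmsea=0.070, srmr=0.030
)
print(complete_sample)
print(all(complete_sample.values()))
```

Returning one flag per index, rather than a single verdict, makes it easy to see which criterion is borderline (here, χ2/df and TLI sit close to their cutoffs).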

If scalar invariance was established, comparisons of latent mean differences across school levels and gender would be conducted. The latent mean parameters of the reference group were set to zero while the latent mean parameters of the comparison groups were allowed to be estimated freely.

Results

The skewness of all items ranged from −1.32 to 0.06, and the kurtosis ranged from −0.35 to 1.89. These results indicated that the item scores were reasonably normally distributed. Descriptive statistics for all items of the BASWBSS are shown in Table 1, and correlations for the items of the School Satisfaction subscale are shown in Table 2. All six items of the School Satisfaction subscale showed statistically significant correlations with each other. In terms of the Affect in School subscale, positive affect was statistically significantly correlated with negative affect at each school level (elementary school students: −0.38; middle school students: −0.46; high school students: −0.36).
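For readers who want to reproduce this kind of normality screening, the following sketch computes moment-based skewness and excess kurtosis for one item. The responses are hypothetical, and the simple moment estimators used here are an assumption: statistical packages such as SPSS report bias-adjusted versions, so results can differ slightly, especially in small samples.

```python
def skewness_and_excess_kurtosis(values):
    """Moment-based skewness and excess kurtosis (0 for a normal distribution)."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n
    if m2 == 0:
        raise ValueError("Skewness and kurtosis are undefined for constant data")
    m3 = sum((v - m) ** 3 for v in values) / n
    m4 = sum((v - m) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


# Hypothetical responses to one 6-point item (left-skewed, as is typical
# when most students endorse the higher response options).
item_responses = [6, 5, 5, 6, 4, 5, 6, 3, 5, 6, 2, 5]
skew, kurt = skewness_and_excess_kurtosis(item_responses)
print(round(skew, 2), round(kurt, 2))
```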

Table 1 Descriptive statistics for the BASWBSS items (N = 799)
Table 2 Correlations for the items of school satisfaction subscale (N = 799)

Measurement Invariance across School Levels

To address whether the BASWBSS items reflected comparable meaning across the school levels, a sequence of stepwise analyses was employed for the examination of measurement invariance. First, baseline CFAs were conducted for each school level separately. As shown in Table 3, the two-factor model, which reflected separable school satisfaction and affect in school factors, showed acceptable fit in each sub-sample. In the complete sample, the results were as follows: χ2/df = 4.927, CFI = 0.967, TLI = 0.951, RMSEA = 0.070, SRMR = 0.030. The factor loadings for the model are presented in Fig. 1.

Table 3 Fit Statistics for baseline model estimation of the BASWBSS (N = 799)
Fig. 1 Confirmatory factor analytic model of the BASWBSS. Note. BASWBSS = Brief Adolescents’ Subjective Well-Being in School Scale

Given the acceptable fit of the two-factor model for each school level, the evaluation of measurement invariance was continued. Multi-group CFAs were employed to examine the psychometric properties of the BASWBSS across school levels. If the more constrained model fit substantially worse than the reference model (ΔCFI > 0.010 or ΔRMSEA ≥ 0.015), the hypothesis of measurement invariance would not be supported. Table 4 reports the fit indices for these invariance tests. The fit of the metric model did not differ substantially from that of the configural model, but the scalar model fit substantially worse than the metric model (ΔCFI = 0.023), indicating that full scalar invariance was not established. Partial scalar invariance was therefore tested by freely estimating the intercepts of selected items. In the final model, the intercepts of three items of the School Satisfaction subscale (i.e., school management, peer relationships, and teaching) were freely estimated, and partial scalar invariance was achieved (ΔCFI = 0.009).
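The decision rule applied at each step can be sketched as a small function. The function name is hypothetical, and the cutoffs follow the Data Analysis section (invariance is retained when ΔCFI ≤ 0.010 and ΔRMSEA < 0.015). Only the two ΔCFI values reported in the text are used; the corresponding ΔRMSEA values appear in Table 4 and are therefore left unspecified here.

```python
def invariance_supported(delta_cfi, delta_rmsea=None):
    """Return True if the more constrained model is retained.

    Magnitudes are used so the sign convention of the change scores
    does not matter. delta_rmsea is optional because it is not always
    reported alongside delta_cfi.
    """
    if abs(delta_cfi) > 0.010:
        return False
    if delta_rmsea is not None and abs(delta_rmsea) >= 0.015:
        return False
    return True


# Delta-CFI values reported in the text for the two scalar models.
reported = {"full scalar": 0.023, "partial scalar": 0.009}
decisions = {model: invariance_supported(d) for model, d in reported.items()}
print(decisions)
```

Applied to the reported values, the rule rejects the full scalar model and retains the partial scalar model, matching the sequence described above.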

Table 4 Measurement invariance across school levels and gender (N = 799)

Latent Mean Differences across School Levels

Evidence for partial scalar invariance enabled us to examine latent mean differences. The latent mean of elementary school students was fixed to zero, and the latent means in the remaining groups were freely estimated. In this study, elementary school students served as the reference group. The results showed that elementary school students reported higher scores on school satisfaction than middle school students (ΔM = −0.286, p < 0.01) as well as high school students (ΔM = −0.518, p < 0.001), and middle school students reported higher scores than high school students (ΔM = −0.260, p < 0.05). In terms of affect in school, elementary school students reported higher scores than high school students (ΔM = −0.314, p < 0.05), while there was no statistically significant difference between elementary school and middle school students (ΔM = −0.167, p > 0.05) and between middle school and high school students (ΔM = −0.147, p > 0.05).

Latent Mean Differences across Gender

A previous study has suggested that the BASWBSS shows measurement invariance across gender in adolescents (Tian et al. 2015). In the current study, measurement invariance across gender in the complete sample (including elementary, middle and high school students) was examined. The results are shown in Table 4. Given the support for configural, metric and scalar invariance, comparisons of latent mean differences across gender were possible.

In the comparisons of latent mean differences across gender, boys served as the reference group in the current study. The results showed that there were no statistically significant latent mean differences across gender on the school satisfaction score (ΔM = −0.095, p > 0.05) or on the affect in school score (ΔM = 0.065, p > 0.05).

Discussion

The major aim of the present study was to evaluate the cross-school-level comparability of the BASWBSS using a sample of Chinese students aged 9–19 from elementary, middle, and high school. We examined the measurement invariance of the BASWBSS across school levels and compared the latent mean differences among these groups. Evidence of partial scalar invariance across school levels was established when the intercepts of three items were freely estimated (i.e., Items 4, 6, and 7 from the School Satisfaction subscale), enabling meaningful comparisons of the latent means (Cheung and Rensvold 2002). Comparisons of the latent mean differences across gender groups were also conducted, as scalar invariance across gender was established.

Following Chen (2007), we examined the noninvariant items before deriving further conclusions. Three intercepts failed to display invariance across school levels. These three items (i.e., Items 4, 6, and 7) all belonged to the School Satisfaction subscale and were used to assess three specific domains of school satisfaction. Items 4 and 7 addressed the domains of school management and teaching, respectively. Both items reflected students’ evaluations of the school’s educational resources. Item 6 addressed the domain of peer relationships, reflecting the quality of peer interactions. The obtained non-invariance may be explained by developmental differences. Compared to childhood, cognitive changes during adolescence strengthen the abilities of individuals to think abstractly and consider multiple dimensions of an issue (Papalia et al. 2007). Therefore, when evaluating educational resources, adolescent students may take many factors into account. Moreover, it is possible that students experience differing levels of access to quality educational resources across different school levels. In terms of peer relationships, as researchers have suggested (Fabes et al. 2009), developmental and contextual changes may affect the nature of peer interactions and behaviors. After entering adolescence, children’s play no longer holds the dominant position in adolescent peer interactions. The interactions and activities become more varied, including activities related to music, movies, shopping, and so on. Moreover, the quality of interpersonal communications garners more attention among adolescents (Rubin et al. 2006). These changes may lead to differences in the understanding and meanings associated with these particular items between children and adolescents. Because our study did not provide direct evidence in support of these speculations, further research is needed to determine the quality and meaning of these items for Chinese students of different age levels. For example, multi-method research (e.g., student interviews and BASWBSS self-reports) should be conducted to clarify the reasons for the non-invariance of the items.

Nevertheless, the establishment of partial scalar invariance suggested that meaningful comparisons of latent mean score differences across school levels could be conducted. Statistically significant differences across the three school levels were subsequently found when comparing the latent mean differences of SWB in school. The findings were consistent with our hypothesis, showing that elementary school students reported the highest scores on school satisfaction, while high school students reported the lowest scores. The results were similar to previous studies which suggested that adolescents’ satisfaction decreases with age (e.g., González-Carrasco et al. 2017; Uusitalo-Malmivaara 2014). In terms of affect in school, although the latent means of this factor decreased as the school levels increased, the differences between elementary and middle school students as well as between middle and high school students were nonsignificant. A statistically significant difference was only found between elementary and high school students. These results were slightly different from those found in the study of Liu et al. (2016), which reported that elementary school students reported higher scores on affect in school than middle and high school students whereas there was no statistically significant difference between middle and high school students.

Latent means were also compared across gender. The results showed that the latent means on the School Satisfaction and Affect in School subscales did not differ significantly between boys and girls. These findings were inconsistent with those of previous studies showing that girls reported higher scores on school satisfaction than boys, whereas there was no significant gender difference for affect in school among children or adolescents (Liu et al. 2016; Verkuyten and Thijs 2002). It should be noted that in the current study, the conclusions were derived from comparisons of latent means whereas the results of the aforementioned studies were derived from comparisons of observed means. Significant mean differences may be masked when comparing observed means, whereas the use of latent mean comparisons reduces measurement bias because latent mean differences are calculated by eliminating partial variance attributed to measurement error (Cheung and Rensvold 2000). However, to date, most studies have used observed mean scores for comparisons, without testing invariance across the groups.

Previous researchers have speculated about why SWB in school decreases in adolescence (e.g., Casas and González-Carrasco 2019; Cummins 2014). Some of the decline may be related to developmental factors, such as changes in cognition and emotion (Žukauskienė 2014), and some of the decline may be related to educational and social factors, such as increasing school challenges and concerns over belonging and conforming to peer pressure (Cummins 2014; Uusitalo-Malmivaara 2014). Thus, given that the decline likely reflects both developmental and contextual factors, both factors should be considered in efforts to promote SWB in school. Recommendations for promoting SWB in school, based on such a multi-factored perspective, can be found in Huebner and Diener (2008).

Compared to previous studies examining the measurement invariance of scales that address SWB in school, our study demonstrated several important merits. First, as mentioned above, previous studies have mostly focused on the cognitive component of SWB in school (e.g., Abubakar et al. 2016; Jovanović 2019), whereas our study examined both the cognitive and affective components of SWB in school through the use of the BASWBSS. This advantage of the BASWBSS is in line with researchers’ suggestions that cognitive and affective components should be taken into consideration simultaneously (e.g., Campbell et al. 1976); thus, the BASWBSS offers researchers a brief two-factor scale with good psychometric properties. Second, we examined measurement invariance across all three major school levels (i.e., elementary, middle, and high school), providing support for the comparability of BASWBSS scores across these school levels. Finally, given that the study addressed SWB in the school domain in particular, comparing scores across school levels was more consistent with key educational demarcations of development than comparing scores across chronological age levels.

However, there were also some limitations of the current study that should be noted. First, the sample of high school students, albeit adequate, was relatively small compared to the samples of elementary and middle school students. Compared to the elementary and middle school students, high school students were busier with their schoolwork and faced great pressure to soon join the workforce or pursue advanced professional knowledge in higher education (Zhang et al. 2010); thus, it was more difficult to obtain their cooperation. Therefore, efforts should be made to improve the size and representativeness of high school samples in future studies. Second, although we provided plausible explanations for the non-invariance of several items, we were not able to specify the true causes. Thus, further research is needed to investigate why students from different school levels interpret these items differently. Finally, although evidence of the validity of the Affect in School subscale has been provided, this subscale consists of only two items, and thus its scores should be interpreted with caution in future studies.

In conclusion, the results of the current study provided important, additional evidence supporting the psychometric properties of the BASWBSS, namely its partial scalar measurement invariance across school levels. The partial scalar invariance of the BASWBSS across different school levels suggests that the BASWBSS can be used to make meaningful comparisons across different student groups, such as elementary school, middle school, and high school students, providing additional support for its validity with Chinese students. Furthermore, the findings indicated that Chinese elementary school students reported higher SWB in school than their middle and high school counterparts based on comparisons of latent means rather than observed means.