Introduction

Formal education, at its heart, is concerned with supporting the development of individuals. This support might be explicit in the form of structuring, instructing and assessing for knowledge acquisition. It might also be implicit in affording opportunities for teacher/peer engagement and personal autonomy.

There is a growing understanding that students engage in their studies for many concurrent reasons (Ryan and Deci 2000). Students may generally find enjoyment and personal value in their studies, or may be motivated to act by rewards, punishments, and other forms of external control. A growing body of evidence from numerous countries indicates that students who perceive their learning to be of interest and personal benefit demonstrate better learning outcomes (Jang et al. 2012; Koizumi and Matsuo 1993; Soenens and Vansteenkiste 2005; Vansteenkiste et al. 2005). These internally regulated motives further play significant roles in different academic subjects (Chanal and Guay 2015), including foreign languages.

While Japanese students consistently rank at the top of international comparisons of achievement in reading, math, and science (OECD 2009, 2012), they have shown a lack of growth with regard to foreign language proficiency (Education First 2017). In localized terms, foreign language motivation and achievement occupy much of the same space in public discourse in Japan (MEXT 2003) as STEM does in North America and other countries (Bureau of Labor Statistics 2011; OECD 2006). Citing significant pressure from civic, industrial, and educational groups to improve citizens’ English competencies in the face of international competition (MEXT 2003), the Japanese government embarked on an expansion of its national compulsory language learning curricula to include elementary schools (MEXT 2008). One of the major elements of this curriculum is the cultivation of interest, enjoyment, and well-being in each subject, with a special focus on helping students “experience the joy of communication in the foreign language” (MEXT 2008). This program of instruction is theoretically well-matched with the self-determination theory (SDT) of human motivation (Deci and Ryan 1985).

Our aim in this paper is to delineate the different motivational profiles found among Japanese elementary school students, and to then track how these students changed subgroups over the course of 2 years. In order to investigate these subgroups and students’ movement between them, we adopted a longitudinal person-centered approach to analyses. Building on previous research in this area (Corpus and Wormington 2014), we employed latent profile transition analysis (LPTA) measuring student motivation at three time points to describe subgroups and membership change over 2 years. In this longitudinal study, we use the SDT framework for understanding both the quality and quantity of motivation, as indicated by autonomous and controlled motives.

Autonomous and controlled motives

Different theoretical paradigms may treat motivation as a quantitative phenomenon, where more is better, or qualitative, where the reasons behind the action and interaction with the environment are of import and interest. In the former camp, expectancy-value theory (Eccles and Wigfield 2002), self-efficacy theory (Bandura 1997), and some recent theories of dynamic systems (Dörnyei and Ryan 2015) have all indicated that the level and intensity of a single unitary motivational construct is the crucial factor in determining success. In the latter camp, SDT (Deci and Ryan 1985) posits that the quality of motivation may lead to sustained engagement and adaptive outcomes. Accordingly, even a high quantity of motivation may not lead to positive outcomes when the motivation is defined and driven by others, and not by the person acting.

SDT separates motivation into a broad continuum from controlled, or originating outside of the person, to autonomous, originating from within. These are further separated into a series of subcomponent regulations that then define the reasons behind a person’s actions. Autonomous motivation is comprised of intrinsic regulation, a desire to act for the enjoyment or satisfaction of the task, and identified regulation, where individuals act to achieve personally valued instrumental outcomes. When students engage in their schoolwork out of enjoyment, curiosity, and a desire to succeed, we can say that they are acting autonomously.

Prior studies have shown numerous positive outcomes associated with autonomous motivation. Students with an internal locus of causality (Deci and Ryan 1985; Ryan and Deci 2017) use more adaptive meta-cognitive strategies such as appropriate time management and planning (Vansteenkiste et al. 2005). They show greater persistence (Hardré and Reeve 2003; McEown et al. 2014), greater interest in the domain of study (Fryer et al. 2014), and procrastinate less often (Senécal et al. 2003). They use more deep-level processing strategies (Grolnick and Ryan 1987; Vansteenkiste et al. 2004; Fryer et al. 2014). Finally, students who feel a sense of ownership over their learning show greater course achievement (Soenens and Vansteenkiste 2005). Studies have also shown autonomous motives to be positive predictors of engagement (Oga-Baldwin et al. 2017; Jang 2008), while others have shown that engagement likewise predicts autonomous motivation (Oga-Baldwin and Nakata 2017).

Another part of the self-determination continuum, controlled motivation refers to motives stemming from either internal or external pressure. In this scenario, students’ locus of causality is outside of their control; their motivation is contingent on stimuli from the surrounding environment, the people in it, or feelings of negative internal pressure. Introjected and external regulations comprise controlled motivation, stemming from pressure and compulsion rather than volition. Introjected regulation describes when students feel a sense of shame, guilt, or other social or non-volitional internal pressure to act. Under external regulation, students engage with their studies to avoid punishment and receive praise or rewards. In education, the components of controlled motivation represent studying not from a desire to learn, but primarily from a lack of choice or a pressure to perform.

As noted, these motivations do not occur in isolation; all student behaviors are regulated for both controlled and autonomous reasons. When compared with students with high autonomous motivation, students with higher controlled motivation show poor concentration and time management and increased anxiety and procrastination (Senécal et al. 2003; Vansteenkiste et al. 2005). These students use more surface-level approaches to learning (Vansteenkiste et al. 2004; Fryer et al. 2014) and ultimately display lower achievement (Soenens and Vansteenkiste 2005). Some recent research (Graves et al. 2015; Howard et al. 2016) has indicated that controlled motives may have positive relationships with more desirable outcomes when combined with matching levels of autonomous motivation. When looking at the nature of students’ individual motivation, it is thus necessary to build models using both autonomous and controlled motives.

Motivational profiles

Research in the SDT tradition has studied controlled and autonomous motivation by looking at how both personal and environmental trends may promote these factors over time. Statistical methods such as structural equation modeling have been used to understand the longitudinal relationships between latent variables to show the individual predictive effects of single variables (e.g., Jang et al. 2012). These studies have the advantage of illustrating how theoretical constructs interact and the effects they may have on future behaviors, achievement, or other outcomes. Person-centered analyses can then be used to complement the findings of variable-centered work by demonstrating how subgroups form based on the above-mentioned variables. Person-centered analyses can explore the motivational profiles of salient subgroups based on a configuration of interacting variables (Vansteenkiste et al. 2009). In the current study, we adopt the latter approach, testing the theoretical and practical issue of how the quality and quantity of individuals’ motivation may change over time.

In practical terms, students’ motivational profiles at each time point may illustrate their potential engagement in school and learning (Vansteenkiste et al. 2009). When students enter school displaying a specific profile, they may have a higher likelihood of maintaining that profile or altering course toward a different one as a result of the interaction between their schooling experiences and their personal motives. Based on the theory that motivation develops as a partial product of the school environment (Reeve 2012), profiles might offer diagnostic evidence of what is and is not working in a particular school setting. These profiles can then further be used to measure the efficacy of specific motivational interventions and programs. The covariates of each profile (e.g., engagement, achievement) can also be used to investigate potential reasons why individual students’ profiles might change during the course of their studies.

Past person-centered studies have found a range of outcomes regarding the number of subgroups that might result among students at different ages and within different contexts (e.g., Corpus and Wormington 2014; Ratelle et al. 2007). While there has been some variation in the results and labeling of the profiles, past studies have tended to show four theoretically consistent profiles of motivation (e.g., Hayenga and Corpus 2010; Vansteenkiste et al. 2009; Wormington et al. 2012). The first profile is “Low Quantity motivation,” where students have both low autonomous and low controlled motives. The second profile is “Poor Quality motivation,” characterized by comparatively higher controlled motivation and comparatively lower autonomous motivation. The third profile is “Good Quality motivation,” represented by higher autonomous motivation and lower controlled motivation. Finally, “High Quantity motivation” is represented by simultaneously high ratings on both autonomous and controlled motivation. While other studies have used a range of terminology to represent these and similar constructs (e.g., Corpus and Wormington 2014; Gillet et al. 2017; etc.), we have adopted the terminology used by Vansteenkiste et al. (2009) to maintain theoretical consistency. According to SDT, profiles with higher autonomous motivation are more likely to show sustainable positive outcomes, while those higher in controlled motivation are more often associated with negative attitudes, behaviors, and achievement (Ryan and Deci 2017).

In one of the first papers on motivational profiles, Ratelle et al. (2007) reported on three studies, two in high school and one in university, all three undertaken in French Canada. All studies used cluster analyses to investigate profiles, and found three profiles in each sample. In the first two studies, they found evidence for a High Quantity and a Poor Quality subgroup, as well as a subgroup that was moderately high on autonomous and controlled motivation. In the sample of university students, the moderate subgroup was replaced by one resembling Good Quality motivation, with high autonomous and low controlled motives. In each sample, students with the highest degree of autonomous motivation showed the most adaptive outcomes. Female students showed slightly higher autonomous motives than males.

In a later series of studies involving Belgian high schools and universities, Vansteenkiste et al. (2009) consistently found the theorized four-profile pattern. Using cluster analysis to look at both samples, the authors found that Good Quality motivation was associated with more adaptive behaviors such as effective time and environment use, use of meta-cognitive strategies, better effort regulation, and higher GPA. Likewise, students with Poor Quality motivation were more likely to report cheating, feel that cheating is acceptable, procrastinate, and show a lower GPA than students in the other profiles. Girls also showed more autonomous motivation and adaptive outcomes than boys.

Studies involving secondary school students in the United States also showed the same four-profile pattern (Hayenga and Corpus 2010; Wormington et al. 2012). As with the work by Vansteenkiste et al. (2009), cluster analysis indicated four student profiles of Good Quality, High Quantity, Poor Quality, and Low Quantity motivations. Among junior high school students (Hayenga and Corpus 2010), the highest GPAs were associated with Good Quality motivation, while the lowest were represented by Poor Quality motivation. Results further showed general within-subject stability across the year for each of the four profiles. While a small number of students did improve, most students who changed profiles (movers) went toward more controlled motivations. In high school (Wormington et al. 2012), students’ achievement was similarly associated with both High Quantity and Good Quality motivation. However, students with higher quantity motivation showed different patterns of participation in extracurricular activities.

Recent work examining Singaporean students’ motivation for physical education (Wang et al. 2016) and mathematics (Wang et al. 2017) again showed different, though related, configurations of motivation. In both studies, latent profile analysis (LPA) was used to preserve the underlying complexity of the data. Looking at primary and secondary school students’ motivation for physical education (Wang et al. 2016), students showed a total of five profiles: three similar to the Poor Quality, High Quantity, and Good Quality subgroups, but also two additional subgroups, one with moderate levels of both autonomous and controlled motivation, and another resembling a more moderate Poor Quality subgroup, with slightly lower controlled and slightly higher autonomous motivation.

In the study on motivation for mathematics (Wang et al. 2017), secondary school students showed four profiles, two with patterns similar to Poor and Good Quality motivation, but also showed an additional Poor Quality-like subgroup, with moderately low controlled motivation and low autonomous motivation. They also found a subgroup with low intrinsic regulation, but high identified and external regulation. In both studies, the Good Quality-like profiles displayed the most effort towards and highest feelings of competence for the specified domains. No significant gender differences were detected.

Most recently, Gillet et al. (2017) demonstrated the most fine-grained differentiations in motivational profiles in a sample of French-Canadian university students. This study offered one of the first uses of LPTA to look at student motivation at two time points, and thus offers a comparison to the current study. This study showed a range of six motivational profiles: two of which were roughly contiguous with Good Quality motivation, one of which corresponded to High Quantity motivation, two of which were roughly analogous with Low Quantity motivation, and one similar to Poor Quality motivation. Consistent with previous findings (e.g., Vansteenkiste et al. 2009), the more autonomously motivated profiles showed more adaptive outcomes, including more positive affect for school, interest, effort, and achievement, and lower levels of boredom, disorganization, and intentions to drop out of university. Likewise, the controlled and poorly motivated profiles showed greater boredom, disorganization, and intention to drop out, with accompanying lower levels of achievement.

Most similar to the current sample and study, Corpus and Wormington (2014) found some of the same profiles as other studies in a sample of elementary school students, namely a Poor and Good Quality motivation profile, as well as a High Quantity motivation profile. This study found no Low Quantity motivation subgroup, which the authors hypothesized as related to the structure of elementary schools. Conducting cross-sectional cluster analysis with the same cohort at two time points, the study traced subgroup membership over the course of a single school year. The Good Quality motivational subgroup (labeled in this study as “primarily intrinsic”) showed the greatest stability, with 76% of students remaining in the cluster over the course of the year. Students in the High Quantity motivation subgroup showed the least stability, with only 45% retaining the same level of motivation. The Poor Quality motivation subgroup (labeled as “primarily extrinsic”) showed fewer changes, with nearly 65% of students reporting the same motivation in the spring as the fall. The more autonomously motivated Good Quality profile students further showed higher grades and scores on standardized tests. No significant differences were found in terms of gender in the three subgroups. It follows then that in elementary schools across cultures, learners may show a greater tendency towards Good Quality motivation; pilot studies in Japanese elementary schools revealed the same three profile patterns using cluster analyses (Oga-Baldwin and Fryer 2017).

Research to this point, therefore, suggests that the number and nature of a sample’s subgroups may be related to some combination of context and age. Students in secondary and tertiary contexts have at different times demonstrated a range of potential profiles (Hayenga and Corpus 2010; Ratelle et al. 2007; Vansteenkiste et al. 2009; Wang et al. 2016, 2017; Fryer et al. 2016). At the same time, elementary students have shown three profiles (Corpus and Wormington 2014; Oga-Baldwin and Fryer 2017). In almost all of the studies discussed, students showed a version of the Good Quality (higher autonomous and lower controlled motivation), High Quantity (similar levels of both autonomous and controlled motivation), and Poor Quality (lower autonomous and higher controlled motivation) profiles.

Based on these previous patterns, different developmental and social factors associated with each level of education may be at work. In secondary and tertiary education, students may display patterns of highs and lows in quality and quantity of motivation for their studies related to the greater degree of freedom over their studies. These students may also have a more mature understanding of the reasons behind their studies (Alexander 2003). At the same time, elementary school in many countries does not have the same life-defining stakes, and fear of failure may be less of an avoidance-inducing motivator (Cave 2007; Covington 1992; Lehtinen et al. 1995; Lewis 1995; Meece and Holt 1993). As in the study by Corpus and Wormington (2014), students may still feel a sense of curiosity and enjoyment in learning related to individual and contextual factors, such as students’ age and daily relationship with their teacher.

Elementary education in Japan

According to ethnographic and observational studies of elementary schools (Cave 2007; Lewis 1995), a central focus of elementary school in Japan is to help individual pupils develop as responsible members of society. Education at this level works to educate the whole person, and includes strong provisions for developing independence and ability within a sense of community. Students learn and interact through set rituals such as school cleaning and serving lunch for one another in order to develop the school community. Teachers spend considerable time building basic skills in arithmetic and literacy, working on both rote and conceptual learning (Cave 2007). Consequently, teachers keep students behaviorally, emotionally, and cognitively engaged through daily routines and positive interpersonal relations.

In comparison to elementary school education, studies have indicated that students in secondary schools may struggle with motivation (Toyama 2007), especially for foreign language (Koizumi and Matsuo 1993; Sakai and Kikuchi 2009). Japanese secondary schools have often been associated with more controlled motivation (Berwick and Ross 1989; Hiromori 2003). In some cases, this decrease in the quality of motivation may begin as early as the end of primary school (Carreira 2012). During this period, students are expected to spend more time preparing for the more testing-oriented environment in secondary schools (Cave 2007), and thus may begin to suffer under the same external pressure from teachers and parents found in other situations (Ryan and Niemiec 2009). Educational surveys involving primary and secondary school students show sharp motivational declines in the context of foreign languages. These surveys consistently indicate a lack of confidence in and desire to learn English compared to other subjects, with the gap widening as students enter secondary school (Benesse Educational Research Development Center 2011).

While students perceive great difficulties in learning a foreign language, the Japanese Ministry of Education, Culture, Sports, Science, and Technology (MEXT) has placed a strong emphasis on English as a tool for communication in a global society in order to remain internationally competitive (MEXT 2003). This Ministry has aimed at addressing the previously noted motivation and achievement problems through changes to the national Course of Study, the policy syllabus for all public and officially recognized private schools. The most recent version (MEXT 2008) puts stronger emphasis on the motivational aspects of learning, considering students’ adaptive interest, attitudes, and behaviors as important outcomes in the learning process. Based on this policy, schools and teachers are expected to promote a sense of autonomous motivation to learn with the ultimate goal of helping students to become lifelong learners (MEXT 2008; Oga-Baldwin and Nakata 2014). Accordingly, while other countries in Asia include standard and formal assessments (Butler 2015), Japanese upper elementary students studying foreign language do so without assessments, rewards, or other externally regulated controls on their behavior. In a low-stakes environment such as this, students may be expected to develop autonomous motivation and positive affect for their learning (Reeve and Assor 2011; Ryan and Niemiec 2009).

Prior variable-centered analyses have indicated that this learning environment may indeed have the intended positive benefits (Oga-Baldwin et al. 2017). However, this work has not examined how individual students might change over time. Education should be understood as a process for providing change. In the best of cases, this change should be for the better, and move students toward more adaptive, more autonomous motivation (Reeve and Assor 2011). Person-centered analyses may offer an enhanced understanding of students’ individual motivational profiles. However, beyond simply finding profiles, we hope to indicate how students move between these profiles over the course of 2 years of schooling. Thus, a better understanding of how student populations change over time may offer a diagnostic for what is and is not working in schools. LPTA offers an opportunity to see those changes without reducing the data through reliance on mean (change) difference testing. Using this methodology, we hope to show how students grow through their final 2 years of elementary school within a highly engaging, low-stakes learning environment (Oga-Baldwin and Nakata, under review).

The current study

In the current study, we worked from the following four hypotheses:

  1. Students in Japanese elementary schools will display the same three profile patterns as those found in the work by Corpus and Wormington (2014): Higher autonomous and lower controlled motivation (Good Quality), similar levels of autonomous and controlled motivations (High Quantity), and higher controlled and lower autonomous (Poor Quality) motivation.

  2. Consistent with Corpus and Wormington (2014), we expect the more adaptive subgroup (Good Quality) to be the most stable over the 2-year period of the study.

  3. Consistent with the efforts of the national government to create an intrinsically motivating learning environment, we predict a pattern of transitions towards increasing student membership within the more motivationally adaptive subgroups.

  4. In line with research on engagement (e.g., Jang et al. 2009, 2012, 2016; Reeve and Lee 2014; Skinner et al. 2008; Oga-Baldwin and Nakata 2017; Oga-Baldwin et al. 2017), we expect students moving toward better quality motivation to report higher engagement than their more extrinsically oriented peers.

Method

Participants

Students were sampled from a suburban-rural city with a population of roughly 100,000 in Western Japan. Public documents indicate the town as middle class, with individual earnings roughly at the national average (Japan Statistics Bureau 2016). All students were ethnically Japanese.

Five hundred and thirteen students (female n = 254, gender unknown = 5) at seven public elementary schools agreed to participate with the signed permission of their parents, teachers, principals, and the board of education. All students in the participating schools granted consent. Students were all in the 5th grade at the start of the research (10–11 years old) and completed 6th grade at the end (12 years old). Students were assigned to 16 homeroom classes, each with an attached teacher. Ethical approval for the research was granted by the Fukuoka University of Education Ethics Review Board.

This study represents an extension of the variable-centered study previously completed by the first author (Oga-Baldwin et al. 2017). The previous study made use of only the first year of this data and sample within a longitudinal structural equation framework. In a follow-up study (Oga-Baldwin and Nakata, under review), the qualitative factors relating to students’ motivation and engagement were investigated, indicating that while the majority of teachers maintained the low-stakes experiential learning environment, a number of autonomy supportive practices coincided with more positive student engagement and motivation. The current study aims to deepen understanding of how a large representative sample of Japanese students develops motivation for a specific school subject (i.e., foreign language) over time as individuals.

Measures

Motivation

Motivation was measured using a Japanese translation of the academic self-regulation questionnaire (SRQ-A; Ryan and Connell 1989; see also Carreira 2012; Yamauchi and Tanaka 1998). This survey is designed to measure the quality of students’ motivation according to SDT’s organismic integration theory continuum from intrinsic to external regulation of motivation using 12 items to represent the four factors. Scales were designed to measure intrinsic, identified, introjected, and external regulations. Scales were Likert-type and ranged from one (“< 50% true for me”) to five (“> 90% true for me”). In line with current Japanese policy on education (MEXT 2008), quality of motivation may be considered an important non-cognitive outcome of schooling (Moore et al. 2015). Students completed these surveys in April 2013, March 2014, and March 2015. Internal reliability for all scales was acceptable at all three time points (all Cronbach’s α > .70; DeVellis 2012). We used the intrinsic and external regulation scales to derive the profiles of students’ motivation to learn English in elementary schools.

Engagement

Recognizing the dynamic and reciprocal nature of engagement and motivation (Oga-Baldwin et al. 2017; Jang et al. 2012; Reeve and Lee 2014), we treated engagement as a covariate of each profile, measured midway during each school year, in October 2013 and 2014. We used a 10-item scale to investigate behavioral, cognitive, and emotional engagement, constrained onto a single latent factor as in previous studies (Jang et al. 2012, 2016; Reeve and Lee 2014). These scales have shown good correlations with external observers’ ratings in variable-centered studies (Oga-Baldwin and Nakata 2017; Oga-Baldwin et al. 2017). Engagement was measured to test for differences in how students in different motivational profiles interact with their learning environment, based on the theory that engagement is a reciprocal predictor of motivation (Reeve and Lee 2014). We used the same 5-point Likert-type scales, ranging from one (“< 50% true for me”) to five (“> 90% true for me”). Internal reliabilities for these measures at both time points were acceptable (Cronbach’s α = .89 in 2013 and .90 in 2014).
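As an illustration of the reliability coefficient reported for the scales above, the following is a minimal sketch of Cronbach’s α computed from raw item responses. The function name and the example responses are hypothetical and are not drawn from the study data; the formula is the standard one, α = k/(k − 1) × (1 − Σ item variances / variance of total scores).

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*responses)]
    # Sample variance of each respondent's summed scale score.
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical 5-point responses from six students on a 3-item scale.
example = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 5, 5],
    [3, 2, 3],
    [1, 2, 1],
    [4, 4, 5],
]
print(round(cronbach_alpha(example), 3))
```

Values above roughly .70 are conventionally read as acceptable, which is the threshold cited in the text.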

Sample items for all scales used are presented in Table 1.

Table 1 Example items, selected from strongest loading items for each factor

Research design

This research used a cohort design, following 513 students across 2 years of upper elementary school. Figure 1 presents the research and sampling design for each of the instruments used.

Fig. 1 Research design for the current study

Students were asked to complete the academic SRQ-A to measure autonomous and controlled motivation. After 6 months, students completed a survey on their in-class engagement. At the end of the first school year, students again took the SRQ-A. In the second year of the study, students responded to the engagement questionnaires in the fall, 1 year after the first engagement survey, and took the SRQ-A a final time at the end of the school year. For modeling purposes, profiles were derived from autonomous and controlled motives measured by the SRQ-A; only SDT variables were used in order to isolate the motivational regulations from other constructs. Recognizing that engagement might be a reciprocal predictor of motivation and catalyst for change (Oga-Baldwin and Nakata 2017; Oga-Baldwin et al. 2017; Reeve and Lee 2014), this variable was treated as a covariate to look at differences in how movements within and between profiles might predict classroom behaviors.

Analyses

In the current study all latent analyses were undertaken employing Mplus 7.2 (Muthén and Muthén 1998–2015). Analysis of observed variables was completed with JMP 9.01 (SAS 2007–2011). To establish the convergent and divergent validity of the constructs under examination, joint Confirmatory Factor Analysis (CFA) was undertaken for each measurement point. A four-factor structure (intrinsic–identified–introjected–external) was hypothesized. Following construct validation, analyses proceeded with invariance testing across the three measurements, investigating the metric invariance, scalar invariance, residual variance invariance, and factor variance invariance. Tests were conducted to demonstrate the theoretically similar functioning of the instruments over time. Fit was confirmed using standard cutoffs (Kline 2011) for the χ2 test, Root Mean Square Error of Approximation (RMSEA), and Comparative Fit Index (CFI).
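For readers less familiar with the fit indices named above, the sketch below shows one common way RMSEA and CFI are derived from model and baseline χ2 statistics. It is an illustration only: the function names and the input values are hypothetical, and software packages differ in small details (for example, whether N or N − 1 appears in the RMSEA denominator), so these functions should not be read as the exact computations performed by Mplus.

```python
import math


def rmsea(chi2, df, n):
    """Root Mean Square Error of Approximation (N - 1 convention assumed)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_model, df_model, chi2_base, df_base):
    """Comparative Fit Index relative to a baseline (independence) model."""
    d_model = max(chi2_model - df_model, 0.0)
    # The denominator is the larger of the two non-centrality estimates.
    d_base = max(chi2_base - df_base, d_model)
    if d_base == 0.0:
        return 1.0
    return 1.0 - d_model / d_base


# Hypothetical values, not taken from the study.
print(round(rmsea(chi2=100.0, df=50, n=501), 4))
print(round(cfi(chi2_model=100.0, df_model=50, chi2_base=1050.0, df_base=50), 4))
```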

Missing data due to non-response and absence were 3.3% of the total volume of data. These missing data were handled using full-information maximum likelihood estimation in Mplus. This is generally held to be the appropriate means of accounting for small amounts of missing data consistent with the present data set (Schafer and Graham 2002).

For all person-centered analyses, only the intrinsic and external regulation scales were utilized. Approaches to analyses (cross-sectional and longitudinal) and interpretation of fit indices relied on Nylund and colleagues’ established practices in this area (Nylund 2007; Nylund et al. 2007; Nylund-Gibson et al. 2014). The examination of latent subgroups began with cross-sectional Latent Profile Analyses to indicate subgroup solutions at each time point. LPA refers to latent variable mixture analysis (Magidson and Vermunt 2004) when only continuous clustering indicators are utilized. LPA was followed by LPTA of the full longitudinal data set. We tested differences in the profiles using multivariate analysis of variance (MANOVA) and univariate analysis of variance (ANOVA) for profiles at each time point.
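To make the idea of LPA concrete, the following is a minimal sketch of a latent profile model fitted by expectation-maximization, treating it as a Gaussian mixture with conditionally independent continuous indicators. Several choices here are assumptions of the sketch rather than details reported in the text: profile-specific means with indicator variances shared across profiles, a deterministic starting point, and hypothetical two-indicator scores (intrinsic, external). It is not the Mplus estimation routine, which also uses multiple random starts.

```python
import math


def fit_lpa(data, k, n_iter=200, tol=1e-8):
    """EM for a k-profile Gaussian mixture; assumes len(data) >= k."""
    n, d = len(data), len(data[0])
    # Deterministic start: sort on the first indicator and cut into k chunks.
    ordered = sorted(data, key=lambda row: row[0])
    size = n // k
    means = []
    for c in range(k):
        part = ordered[c * size:(c + 1) * size] if c < k - 1 else ordered[c * size:]
        means.append([sum(r[j] for r in part) / len(part) for j in range(d)])
    grand = [sum(r[j] for r in data) / n for j in range(d)]
    variances = [max(sum((r[j] - grand[j]) ** 2 for r in data) / n, 1e-6)
                 for j in range(d)]
    weights = [1.0 / k] * k

    prev_ll = -math.inf
    ll = prev_ll
    for _ in range(n_iter):
        # E-step: posterior profile probabilities for each student.
        resp, ll = [], 0.0
        for row in data:
            logp = []
            for c in range(k):
                lp = math.log(weights[c])
                for j in range(d):
                    lp -= 0.5 * math.log(2 * math.pi * variances[j])
                    lp -= (row[j] - means[c][j]) ** 2 / (2 * variances[j])
                logp.append(lp)
            top = max(logp)
            log_norm = top + math.log(sum(math.exp(v - top) for v in logp))
            ll += log_norm
            resp.append([math.exp(v - log_norm) for v in logp])
        # M-step: update profile sizes, means, and shared variances.
        for c in range(k):
            n_c = max(sum(r[c] for r in resp), 1e-12)
            weights[c] = n_c / n
            means[c] = [sum(resp[i][c] * data[i][j] for i in range(n)) / n_c
                        for j in range(d)]
        variances = [
            max(sum(resp[i][c] * (data[i][j] - means[c][j]) ** 2
                    for i in range(n) for c in range(k)) / n, 1e-6)
            for j in range(d)
        ]
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    # ll is the log-likelihood evaluated at the last E-step.
    return weights, means, variances, ll


# Hypothetical (intrinsic, external) scores for ten students.
scores = [[4.3, 1.6], [4.4, 1.4], [4.5, 1.5], [4.6, 1.6], [4.7, 1.4],
          [1.8, 4.1], [1.9, 3.9], [2.0, 4.0], [2.1, 4.1], [2.2, 3.9]]
weights, means, variances, ll = fit_lpa(scores, k=2)
print(weights, means)
```

In this toy example one recovered profile has high intrinsic and low external scores and the other shows the reverse, which loosely mirrors the Good Quality and Poor Quality labels used in the text.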

LPTA is the longitudinal extension of LPA. LPTA integrates auto-regressive modeling (a variable predicting itself in the future) to examine subgroup membership over time (Nylund et al. 2007). In contrast to the more commonly utilized K-means approaches to longitudinal person-centered analysis, LPTA can simultaneously estimate subgroup membership at multiple time points and the transitions among these subgroups between time points. LPTA can thereby estimate where students start (their initial subgroup profile) at the beginning of their fifth school year and then provide the same information at the end of their sixth year. Finally, LPTA maps how students move between these subgroups, providing probability estimates of both subgroup memberships and transitions.
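To make the auto-regressive idea concrete, the following is a minimal sketch of how a set of initial membership probabilities and a matrix of transition probabilities jointly imply subgroup membership at the next time point. All numbers are hypothetical and chosen only for illustration; they are not estimates from this study, and the function name `propagate` is our own.

```python
def propagate(initial, transition):
    """Return next-time-point membership probabilities.

    initial:    probability of belonging to each subgroup at the first time point
    transition: transition[i][j] is the probability of moving from subgroup i
                to subgroup j (each row sums to 1; the diagonal holds "stayers")
    """
    n = len(initial)
    return [sum(initial[i] * transition[i][j] for i in range(n)) for j in range(n)]


# Hypothetical values for three subgroups (illustration only, not study estimates)
initial = [0.20, 0.50, 0.30]
transition = [
    [0.50, 0.40, 0.10],
    [0.05, 0.80, 0.15],
    [0.02, 0.08, 0.90],
]

later = propagate(initial, transition)
for share in later:
    print(f"{share:.3f}")
```

In an actual LPTA these quantities are estimated simultaneously from the data rather than supplied by hand; the sketch only shows how the two sets of probabilities relate to one another.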

At each measurement time, two through five latent subgroups were tested and compared using LPA. Fit to the sample was evaluated using information criteria, likelihood ratio tests, relevant theory, past empirical findings, and subgroup size. For LPAs at the three measurement points, two Likelihood Ratio Tests and three information criterion indices were employed. We then examined the sample for within-subject changes, looking at those who remained in the same profile (“stayers”) and those who changed profiles (“movers”).

For the LPTA “Mover–Stayer” model (Langeheine and van de Pol 2002), only the three information criterion indices were available using the standard specification method for assessing subgroups and transitions across the three time periods (Nylund 2007; Nylund et al. 2007; Nylund-Gibson et al. 2014; see Footnote 1). For the Likelihood Ratio Tests, the Vuong–Lo–Mendell–Rubin Likelihood Ratio Test (Vuong 1989) and the Lo–Mendell–Rubin Likelihood Ratio Test (Lo et al. 2001) both test whether the identified set of latent subgroups fits the data significantly better than a solution with one group fewer; a non-significant result indicates that the solution with one group fewer is to be preferred. For the information criteria, Akaike’s Information Criterion (AIC; Akaike 1987), the Bayesian Information Criterion (BIC; Schwarz 1978), and the sample size-adjusted BIC are each model selection criteria, wherein lower values indicate the preferred model. While all three information criteria have their weaknesses, the BIC is generally seen as being the most useful information criterion guide for person-centered latent analyses (Nylund-Gibson et al. 2014).
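As an illustration of how the three information criteria compare candidate solutions, the sketch below uses their commonly cited definitions: AIC = −2LL + 2k, BIC = −2LL + k·ln(n), and the sample size-adjusted BIC, which replaces n with (n + 2)/24. The log-likelihoods, parameter counts, and sample size are hypothetical placeholders, not values from this study, and the function name is our own.

```python
import math


def information_criteria(log_likelihood, n_params, n_obs):
    """Compute AIC, BIC, and sample size-adjusted BIC for one fitted model."""
    deviance = -2.0 * log_likelihood
    return {
        "AIC": deviance + 2 * n_params,
        "BIC": deviance + n_params * math.log(n_obs),
        "SABIC": deviance + n_params * math.log((n_obs + 2) / 24),
    }


# Hypothetical fits: number of subgroups -> (log-likelihood, free parameters)
candidates = {2: (-1250.0, 7), 3: (-1200.0, 10), 4: (-1195.0, 13)}
n_obs = 500  # hypothetical sample size

table = {k: information_criteria(ll, p, n_obs) for k, (ll, p) in candidates.items()}

# Lower values indicate the preferred model; here BIC is used as the guide.
best_k = min(table, key=lambda k: table[k]["BIC"])
print(best_k)
```

Because the BIC penalty grows with ln(n), it tends to favor more parsimonious solutions than AIC does in larger samples, which is one reason the indices can disagree on the number of subgroups.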

Results

The results for this study are presented beginning with construct validation and invariance testing, followed by a brief overview of correlations and descriptive statistics. The person-centered results begin with cross-sectional LPA of each time point, followed by a series of LPTAs to finalize the best-fitting Mover–Stayer model.

Construct validation and invariance over time

CFA was initially used to test the longitudinal invariance of the factor structure over time for the main motivation regulation variables. Individual confirmatory factor analyses using robust maximum likelihood showed good fit for four factors at each time point, Time 1 (T1): χ2 (48) = 107.586, p < .001, RMSEA = .050 [confidence interval (CI) = .037, .062], CFI = .96; Time 2 (T2): χ2 (48) = 78.256, p < .001, RMSEA = .035 [CI = .020, .049], CFI = .98; Time 3 (T3): χ2 (48) = 73.448, p < .001, RMSEA = .033 [CI = .016, .048], CFI = .99.

A longitudinal invariance test was conducted using five models: a configural model, a metric invariance model, a scalar model, a residual variance invariance model, and a factor covariance model. Acceptable fit for these models (Kline 2011) indicates that the instruments functioned similarly across the entire period studied, verifying the reliability of the profiles. A factor mean invariance model was not tested, as we intended to look for changes in the subgroup scores, and thus expected changes in the factor means. The four factors (intrinsic, identified, introjected, and external regulations) were treated as separate at each time point. In every model, factors were allowed to correlate with the same factor at each time point (e.g., intrinsic time1 ⇔ intrinsic time2) and with other factors measured at the same time (e.g., intrinsic time1 ⇔ identified time1), but not across factors across time (i.e., no cross-lagged correlations such as intrinsic time1 ⇔ identified time2). Error terms for each item and each factor were correlated across each time point to account for wording artifacts. The configural model showed acceptable fit, χ2 (540) = 1282.619, p < .001, RMSEA = .052 [CI = .048, .055], CFI = .90, indicating that each of the individual factors generalizes over time. A metric invariance model was then tested, holding all factor loadings as equal across each time point. This model also showed acceptable fit, χ2 (554) = 1180.167, p < .001, RMSEA = .047 [CI = .043, .051], CFI = .91, with significant improvements over the configural model, Satorra–Bentler χ2 (14) = 99.591, p < .001. This indicated that the factor loadings could be assumed to be similar at each point. We then tested the scalar invariance model, constraining the intercepts to be equal over time. This model showed slightly weaker fit, χ2 (566) = 1236.097, p < .001, RMSEA = .048 [CI = .044, .052], CFI = .90, Satorra–Bentler versus metric invariance χ2 (12) = 63.595, p < .001.
While fit was not as good as the metric invariance model, it was acceptable, indicating that the intercepts for like items were also similar across the three time points. We then tested the invariance of the residuals, constraining them to be equal as well. This model again showed very similar fit, χ2 (581) = 1236.281, p < .001, RMSEA = .047 [CI = .043, .051], CFI = .90, Satorra–Bentler versus scalar invariance χ2 (15) = 11.586, p = .710, indicating the items to be reliable across the three time points. Finally, the factor covariance model showed acceptable fit, χ2 (587) = 1275.768, p < .001, RMSEA = .048 [CI = .044, .053], CFI = .90, Satorra–Bentler versus residual variance invariance χ2 (6) = 11.586, p < .001, indicating that the covarying constructs were equivalent over time. The above tests indicated that the constructs functioned similarly at each of the time points, allowing us to complete the person-centered investigation of how students move between motivational profiles over time based on these constructs.

Descriptive findings

The correlations between all modeled variables, the reliability of all scales, and the descriptive statistics are presented in Table 2. Regarding mean level changes, a number of small differences were observed over time: intrinsic regulation increased (p < .0001, F = 14.99, R2 = .02); external regulation decreased (p < .0001, F = 2.51, R2 = .03); introjected regulation decreased (p < .0001, F = 13.44, R2 = .02); identified regulation did not change; engagement increased (p < .01, F = 8.4, R2 = .01). Correlations across the variables were consistent with past research in this area (Oga-Baldwin and Nakata 2017; Oga-Baldwin et al. 2017). Consistent with many of the other studies surveyed, male gender had a weak negative correlation with intrinsic motivation.

Table 2 Correlations and descriptive statistics for all observed variables

Person-centered results

Profile analysis

LPA at each of the three measurement times was conducted with intrinsic and extrinsic motivation. For each of the LPAs, two through five subgroup solutions were tested. For each LPA, Information Criteria, Likelihood Ratio Tests, subgroup size, and theory were examined and reviewed to establish the best solution (Table 3). For Time 1 (T1; spring 2013) and Time 3 (T3; spring 2015), BIC (generally the most informative information criterion; Nylund 2007; Nylund-Gibson et al. 2014) indicated three subgroups. The three-subgroup solution was supported by the Likelihood Ratio Tests at T1 but not T3. Theory (three clearly theoretically discernible profiles) and subgroup size (> 5%) also supported the three-subgroup solution. At Time 2 (T2; spring 2014), Information Criteria were not informative and Likelihood Ratio Tests suggested a two-subgroup solution, which did not present theoretically meaningful profiles. Three subgroups presented theoretically meaningful profiles that were consistent with T1 and T3. Given the lack of clear direction from the statistical indices, three subgroups were selected as the best possible solution.

Table 3 Fit for Time 1, 2, and 3 LPA (2–5 subgroups)

The Mover–Stayer model was then tested with two through four subgroups. We tested the different subgroup models in order to establish the validity of the three-subgroup solution resolved through cross-sectional analyses across the 2-year study. For the LPTAs, only information criteria were available to support subgroup solution decisions. As a result, BIC, subgroup size, and theory were relied on for solution decisions. BIC presented very clear support for three subgroups (clear minimum at three subgroups). Furthermore, theoretically consistent profiles, along with reasonable subgroup size, supported this result (Table 4). The LPTA information criteria clearly supported the choice of three subgroups for each data point, supporting the cross-sectional subgroup solution decisions.

Table 4 Fit for LPTA

The profiles for the three subgroups best represented Poor Quality (comparatively low quantity of autonomous motivation compared to high controlled motivation), High Quantity (comparatively high quantity of both autonomous and controlled motivation) and Good Quality (comparatively higher autonomous motivation than controlled motivation), supporting hypothesis one. The composition of each of these subgroups is visually represented in Fig. 2.

Fig. 2 Profiles for each subgroup at all three measurements. PQ Poor Quality, HQ High Quantity, GQ Good Quality, T1 Time 1 (spring, 2013), T2 Time 2 (spring, 2014), and T3 Time 3 (spring, 2015)

MANOVA and ANOVA results

We conducted multivariate analysis of variance (MANOVA) tests including intrinsic, identified, introjected, and external regulations at times T1, T2, and T3. MANOVA tests showed differences at measurement T1 (Wilks’ Lambda = .37, DF = 6, F = 80.23, p < .0001), T2 (Wilks’ Lambda = .35, DF = 6, F = 85.60, p < .0001), and T3 (Wilks’ Lambda = .32, DF = 6, F = 118.93, p < .0001), accounting for 63%, 65%, and 68% of the profile variables’ variance at T1, T2, and T3 respectively. These tests confirm the differences for each of the measured variables based on the profile subgroups.
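The percentages of variance reported above follow directly from the Wilks’ Lambda values, since 1 − Λ gives the proportion of variance in the set of dependent variables associated with group membership. A short sketch of that arithmetic, using only the Lambda values reported in the text:

```python
# Wilks' Lambda values as reported for each measurement point
wilks_lambda = {"T1": 0.37, "T2": 0.35, "T3": 0.32}

# Proportion of variance associated with profile membership is 1 - Lambda
explained_percent = {
    time: round((1 - lam) * 100) for time, lam in wilks_lambda.items()
}

for time, pct in explained_percent.items():
    print(f"{time}: {pct}%")
```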

The nature of the subgroups was stable across the three measurement points (T1, T2, and T3). Table 5 presents the difference testing results across three subgroups and three measurement points. Differences were observed for the two profiled variables at all three time points (p < .001; R2 = .20–.74). Identified regulation was also found to vary in a manner consistent with theory across all three subgroups (p < .001; R2 = .23–.50). Introjected regulation showed only slight variation (p < .001–.05; R2 = .01–.04). These differences further confirm the varying compositions of each profile in line with previous theoretical positions (Vansteenkiste et al. 2009).

Table 5 ANOVA results for all variables across the measurement points T1, T2 and T3, separated by profile

An examination of the profiles for the first two measurement points (T1 and T2) presents a gradual increase in engagement from the least adaptive to the most adaptive subgroup: i.e., Poor Quality, High Quantity, and then Good Quality. Analysis of variance (ANOVA) of profiles taken at the Time A (fall, 2013) and Time B (fall, 2014) measurements also showed moderate differences in engagement across the subgroups (p < .0001, R2 = .14, .20).

Mover–Stayer analysis

Across the three measurements, the Mover–Stayer model presents a pattern of students increasingly joining the High Quantity (net from Poor Quality to High Quantity: n = 42 across 2 years) and Good Quality (net from High Quantity to Good Quality: n = 35 across 2 years) motivational subgroups over 2 years of English education (see Fig. 3). We will now present the within-subject stability of the subgroups in terms of the “stayers” (those who remained in the same profile from one point to the next) and the within-subject variability in terms of the “movers” (those who changed to a different profile; Langeheine and van de Pol 2002).

Fig. 3 Elementary school students’ latent transitions between three subgroups and across 2 years. Bold percentages represent the stability of the profiles over time

The stability of subgroups across the 2 years varied and presented a pattern of movement toward membership in more autonomously regulated profiles and away from controlled motivation. The Poor Quality subgroup was the least stable (46% stayers T1–T2); this instability increased over the T2–T3 transition (33% stayers), and the subgroup was very unstable overall (16% stayers across 2 years). The High Quantity subgroup started substantially more stable (89% stayers T1–T2), though this stability decreased between T2 and T3 (72% stayers), presenting some instability over the entire 2-year study (62% stayers across 2 years). The Good Quality subgroup was both the most stable to start (92% stayers T1–T2) and remained consistent at T2–T3 (92% stayers). Good Quality was therefore the most stable subgroup across the three measurement points (83% stayers over 2 years). The stability of the Good Quality profile supported hypothesis two.

Though a minority in terms of the overall model, the movers generally showed greater movement toward more autonomous motives. From T1 to T2, 32 students moved from the Poor Quality to the High Quantity subgroup, while 21 students moved from High Quantity to Good Quality. Moving in the opposite direction, 16 students moved from Good Quality to High Quantity, seven students moved from Good Quality to Poor Quality, and one moved from High Quantity to Poor Quality. From T2 to T3, 19 students moved from the Poor Quality to the High Quantity subgroup, four students moved from Poor Quality to Good Quality, and 52 students moved from High Quantity to Good Quality. At the same time, 22 students moved from the Good Quality profile to the High Quantity profile, while eight students moved from High Quantity to Poor Quality. Results are consistent with hypothesis three, that students would generally move toward more autonomous motivation.
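The mover counts above can be tallied to show the direction of movement at each transition and the net flows between adjacent profiles. The sketch below uses only the counts reported in this section; the ordering of profiles from least to most adaptive (Poor Quality, High Quantity, Good Quality) follows the text, while the variable and function names are our own.

```python
# Profiles ordered from least to most adaptive
ORDER = {"PQ": 0, "HQ": 1, "GQ": 2}

# Mover counts as reported in the text: (from, to) -> number of students
moves = {
    "T1-T2": {("PQ", "HQ"): 32, ("HQ", "GQ"): 21,
              ("GQ", "HQ"): 16, ("GQ", "PQ"): 7, ("HQ", "PQ"): 1},
    "T2-T3": {("PQ", "HQ"): 19, ("PQ", "GQ"): 4, ("HQ", "GQ"): 52,
              ("GQ", "HQ"): 22, ("HQ", "PQ"): 8},
}


def direction_totals(counts):
    """Return (upward, downward) mover totals for one transition."""
    up = sum(n for (a, b), n in counts.items() if ORDER[b] > ORDER[a])
    down = sum(n for (a, b), n in counts.items() if ORDER[b] < ORDER[a])
    return up, down


def net_flow(source, target):
    """Net number of students moving source -> target across both transitions."""
    return sum(c.get((source, target), 0) - c.get((target, source), 0)
               for c in moves.values())


totals = {period: direction_totals(c) for period, c in moves.items()}
net_pq_to_hq = net_flow("PQ", "HQ")
net_hq_to_gq = net_flow("HQ", "GQ")

print(totals)
print(net_pq_to_hq, net_hq_to_gq)
```

Summed over both transitions, the net flows come to 42 students from Poor Quality to High Quantity and 35 from High Quantity to Good Quality, matching the net figures given at the start of this section.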

Students’ engagement for the Mover–Stayer subgroups’ profiles at the two transitions (T1–T2, T2–T3), as well as the overall change across the 2 years (T1–T3), is presented in Table 6 and depicted in Figs. 4, 5, and 6. Across all profiles, a clear pattern of engagement can be observed at both Time A and Time B. Poor Quality stayers, High Quantity stayers, and Good Quality stayers showed a respectively increasing pattern of engagement. In support of hypothesis four, students who moved toward a more internally regulated profile also generally showed higher levels of engagement, while students moving toward more external control showed comparatively lower levels of engagement. No specific pattern for gender was noted among the movers and stayers (no significant chi-square difference at p < .01).

Table 6 Engagement differences by movement profile

Fig. 4 Engagement for each Mover–Stayer profile, Time 1 to Time 2. P–P Poor Quality to Poor Quality (Stayer), P–H Poor Quality to High Quantity (Mover), P–Q Poor Quality to Good Quality (Mover), H–P High Quantity to Poor Quality (Mover), H–H High Quantity to High Quantity (Stayer), H–Q High Quantity to Good Quality (Mover), G–P Good Quality to Poor Quality (Mover), G–H Good Quality to High Quantity (Mover), G–G Good Quality to Good Quality (Stayer)

Fig. 5 Engagement for each Mover–Stayer profile, Time 2 to Time 3. P–P Poor Quality to Poor Quality (Stayer), P–H Poor Quality to High Quantity (Mover), P–Q Poor Quality to Good Quality (Mover), H–P High Quantity to Poor Quality (Mover), H–H High Quantity to High Quantity (Stayer), H–Q High Quantity to Good Quality (Mover), G–P Good Quality to Poor Quality (Mover), G–H Good Quality to High Quantity (Mover), G–G Good Quality to Good Quality (Stayer)

Fig. 6 Engagement for each Mover–Stayer profile, Time 1 to Time 3. P–P Poor Quality to Poor Quality (Stayer), P–H Poor Quality to High Quantity (Mover), P–Q Poor Quality to Good Quality (Mover), H–P High Quantity to Poor Quality (Mover), H–H High Quantity to High Quantity (Stayer), H–Q High Quantity to Good Quality (Mover), G–P Good Quality to Poor Quality (Mover), G–H Good Quality to High Quantity (Mover), G–G Good Quality to Good Quality (Stayer)

Discussion

In this study we hypothesized that students in Japanese elementary schools would display the same three profile patterns as those found in the work by Corpus and Wormington (2014): Primarily autonomous motivation (Good Quality), similar levels of autonomous and controlled motivation (High Quantity), and primarily controlled motivation (Poor Quality). Results supported this hypothesis. Also consistent with Corpus and Wormington (2014), we expected the more adaptive subgroup to be the most stable over the 2-year period of the study. Finally, we predicted that consistent with the efforts of the national government, a pattern of transitions towards increasing student membership within the more motivationally adaptive subgroups would emerge.

Expanding on previous variable-based understandings of longitudinal motivational development (Jang et al. 2012, 2016; Reeve and Lee 2014; Oga-Baldwin et al. 2017), LPTA was conducted on autonomous and controlled motivation self-reports from one cohort of fifth- and sixth-grade Japanese elementary school students studying English as a foreign language. Student engagement was measured twice: at Time A between the T1 and T2 and Time B between the T2 and T3 motivational measures. Engagement was related to higher quality motivation, and further showed a pattern of increasing among students who stayed in or moved toward a more autonomously motivated profile. This corroborates previous findings of a positive dynamic relationship between more autonomous motives and engagement (Oga-Baldwin et al. 2017), and supports our hypothesis that engagement would be more strongly associated with the more autonomous profiles.

Consistent with Corpus and Wormington (2014), three reliable subgroups were observed across the current study: Poor Quality, High Quantity and Good Quality. The latent subgroups identified explained a substantial amount of variance in the profiled variables, most motivational covariates and engagement outcomes. As predicted, students within the more adaptive subgroups were observed to be the most engaged. Also as expected, the more adaptive subgroups were the most stable across the current study. Finally, the pattern of transitions across three measurement points and 2 years, with a representative cohort of Japanese students, suggests that teachers in these schools may be helping students to “experience the joy of communication in the foreign language” (MEXT 2008). These results offer specific theoretical and practical implications.

Implications for theory

Support for three elementary school profiles

The three-profile pattern fit the data best at each time point, indicating that elementary school students in Japan might more closely conform to the subgroups found by Corpus and Wormington (2014). While high school and university students may show four or more patterns of motivation (e.g., Vansteenkiste et al. 2009; Gillet et al. 2017), their comparative maturity and experience likely explain the presence of more nuanced profiles, including a Low Quantity profile. As commented by Corpus and Wormington (2014), the learning environment in elementary schools may provide better support for students’ autonomous motives. Further, younger students may lack the life experience to develop a sense of Low Quantity motivation (Alexander 2003).

Growing quality and quantity of motivation during elementary school

In line with theoretical arguments on how schools may influence motivation (Ryan and Niemiec 2009), this study illustrates how a low-stakes, high-interest environment may relate to individual students’ motives over time. While prior variable-centered discussions have indicated how motivation itself may develop (Oga-Baldwin et al. 2017), the person-centered analyses here show how students move between profiles. Consistent with the position held by SDT, the general trend toward higher quality and quantity motivation within this sample indicates that the schools in this sample were places that promoted positive well-being (Reeve and Assor 2011). These results are also consistent with the work by Corpus and Wormington (2014), and indicate that while in some situations controlled motivation may remain more stable (Gillet et al. 2017), in Japanese elementary foreign language classes the most autonomous motives were the most stable (over 80% across the three measurement points, and greater than 90% at each of the two transition points).

Cultural implications

There are often questions of the cross-cultural applicability of different theories (Iyengar and Lepper 1999; Furtak and Kunter 2012). Similar to the United States sample (Corpus and Wormington 2014), students showed three patterns of motivation. In this Japanese sample, elementary students showed a higher propensity toward autonomous motives, steadily improving over the course of 2 years. Students generally started in positive profiles, and the majority moved toward increasing quality of motivation. While students in East Asian contexts may at times show positive results with more socially controlled motives (e.g., Iyengar and Lepper 1999; Zusho and Clayton 2011), students in the current study’s sample demonstrated the highest engagement in the most autonomously motivated profiles. These results corroborate previous findings in the SDT literature set in East Asia (e.g., Jang et al. 2009), and indicate the applicability of the theory to Japanese elementary school education.

Implications for practice

Supporting motivational development

For teachers, the results of this study indicate that motivation can improve given a positive, supportive, and engaging learning environment. Teachers working towards low-stakes, high-interest instruction may help students to develop better quality motivation for learning. As shown previously (Oga-Baldwin et al. 2017), support for engagement appears to have a crucial role in helping students to improve autonomous motivation. While the directionality of this relationship cannot be inferred from the current data, students who changed to a better-quality profile tended to show higher engagement than students who remained in or moved toward a poorer-quality profile. Although qualitative triangulation indicates that teachers throughout the district had different patterns of instruction and no overall uniform level of support (Oga-Baldwin and Nakata, under review), according to SDT, a general pattern of increase in autonomous motivation would indicate that more students believed their classroom environments to be supportive of their needs.

The question thereby remains as to the persistence of these motivational profiles. Does this low-stakes environment continue to promote motivation to learn a foreign language for students moving into an environment with higher stakes (i.e., secondary school)? Crucially, students’ proficiency was not assessed or accounted for in the current study. How does students’ sense of efficacy and agency affect their learning and achievement? While assessment is specifically not allowed in elementary foreign language classes for the potentially damaging effect it may have on motivation (MEXT 2008), it is a central feature of secondary education. Further research on this and other environments is necessary to determine the true motivational effects in the transition to secondary school and beyond.

Limitations and future directions

Data here come entirely from self-reported sources, and thus should be considered carefully. While coming from a roughly representative sample of non-urban Japanese areas, the students were also located in a single school district. Further, achievement data were not available due to national policies regarding the use of language testing in elementary schools (MEXT 2008). At the same time, this study focused on the growth and development of internally regulated motives in elementary schools as outcomes (Moore et al. 2015), and thus was not concerned with achievement.

One question that remains is whether this growth in the quantity and quality of motivation can be maintained into secondary school. While the goal of elementary foreign language study is to promote interest and positive affect for foreign language (MEXT 2008), this goal is couched within the larger goal of raising lifelong learners. As such, future research will need to investigate how students’ motivation continues to grow and change across formal education and the life span.

Conclusions

The above research suggests that these elementary students were likely to engage in their studies for internally regulated reasons. Students in Japanese elementary schools showed three profiles, much like those in elementary schools in the United States (Corpus and Wormington 2014), trending toward a shift from lower to higher quality motivation. Theoretically, the results also indicate that schools can indeed be places which promote autonomous motivation (Reeve and Assor 2011), even in societies which may be ostensibly more oriented toward top-down control (Iyengar and Lepper 1999). Combined, these findings indicate that elementary students in Japan may develop a sense of internally regulated motives for learning English. The current study’s findings hint that by providing adequate support, teachers can help students develop adaptive motives for studying English as a foreign language.

The goals of Japan’s national curriculum are clearly aligned towards improving the quality of students’ motivation (MEXT 2008). At the same time, it is important to remember that developing positive affect may be a means rather than a terminus for teachers and students in the context of schools. For meaningful integration into the global English speaking community, motivation is a necessary but not sufficient pre-requisite. In short, students may achieve the goal of positive affect, but that affect must eventually translate into the harder work of thinking in and using the foreign language. While continuing studies are necessary in secondary schools with consideration for how students develop real world skills, the current study indicates a positive trend toward promoting students’ motivation and engagement.