There has been an increasing interest in applying digital games in teaching mathematics. In many countries, teachers and educational authorities are concerned about a decreasing motivation to study mathematics, and educational games are considered to be a solution which would make mathematics education more fun and motivating. For example, a survey carried out amongst Finnish school teachers by Klemmetti and colleagues (2009) revealed that 99 % of respondents believed game-based learning environments would motivate students’ learning. Despite this assumption, there is a lack of empirical studies providing evidence that game-based learning environments are able to significantly increase motivation towards learning (Connolly, Boyle, MacArthur, Hainey, & Boyle, 2012), at least not when compared to conventional instruction methods (Wouters, van Nimwegen, van Oostendorp, & van der Spek, 2013).

A prototype game-based learning environment, Number Navigation Game (NNG), was developed to enhance students’ adaptivity with arithmetic (see NNG description by Lehtinen et al., in this volume). NNG is based on an integrated game design (Habgood & Ainsworth, 2011) in which the game mechanics offer extensive and situated practice through which flexible and adaptive arithmetic problem-solving skills can be strengthened. The tasks players have to solve during gameplay are increasingly demanding and differ substantially from the tasks used in regular mathematics education. The prototype used for this study still has few external motivating elements, so students’ engagement derives from the game mechanics themselves. The aim of the present study was to examine the effects of playing NNG on students’ motivation towards mathematics as well as on their arithmetic fluency, and whether these effects are related to students’ game experiences.

Expectancy-Value Model as a Comprehensive Measure for Studying Motivation

Motivation is a broad concept for which numerous and sometimes overlapping theories exist (De Brabander & Martens, 2014). Motivation is often described as a set of cognitive motives which, together with emotions (which are alternately considered as a subset of motivation or as something separate from it) influence behavior; these motives can include beliefs, values, expectancies, intentions, or goals (Wegge, 2001). In this study, motivation was looked at from the theoretical perspective of the expectancy-values model, more specifically, Eccles and colleagues’ (2002) expectancy-values model. The expectancy-values model is particularly suitable for this study because its usefulness in predicting students’ future performance, persistence, and task choice has been demonstrated in educational studies (Berger & Karabenick, 2011; Eccles & Wigfield, 2002; Wigfield & Cambria, 2010). In this model, the expectation to succeed and the value given to succeeding will determine a person’s motivation to perform tasks (Wigfield, 1994). The development of expectancy-values is influenced by psychological, sociocultural, and contextual factors, such as the feedback a child receives from parents, schools, and peers (Wigfield & Cambria, 2010). Expectancy-values are already distinct in young children (Wigfield & Eccles, 2000).

Expectancy refers to how well a person believes they will perform a task whereas values refer to a person’s reasons for engaging in a task (Wigfield & Cambria, 2010). Expectancy is understood as a personal expectation or belief in one’s own ability to succeed at a task. Value is task-subjective and composed of four aspects: intrinsic value, attainment value, utility value, and cost. Intrinsic value refers to interest, or how enjoyable a person finds the task, and is linked to Ryan and Deci’s (2000) concept of intrinsic motivation (Wigfield & Eccles, 2000). Attainment value is determined by how important it is for a person’s identity to perform well at the task. Utility value is defined as how useful the task is for a person’s life. Cost, the least studied of these values, focuses on the perceived price a person feels they must pay in order to perform well on a task, both in terms of effort and time. Throughout the present study, motivation towards math is studied through the expectancy-value model, focusing on the variables of self-efficacy and interest, utility, attainment value, and cost.

Game Experience

Students’ experiences during gaming depend on game features and on students’ individual interpretations of these features. These interpretations may be related to age and gender. Interindividual differences in experiencing the game and gaming situations can mediate motivational and cognitive effects of educational games (Järvelä et al., 2000; Lowyck, Lehtinen, & Elen, 2004). Nevertheless, a common understanding of game experience has yet to be found (IJsselsteijn, de Kort, Poels, Jurgelionis, & Bellotti, 2007; Kiili, Lainema, De Freitas, & Arnab, 2014; Nacke & Drachen, 2011), though different frameworks and models have been proposed (see review by Nacke & Drachen, 2011). IJsselsteijn and colleagues (2007) argued this is because the field is relatively young, and the great variety of games makes it difficult to find a “one-size-fits-all” method to study all experiences elicited by games.

For the purposes of this study, game experience was considered from the framework developed by Poels and colleagues (2007). Their framework is composed of seven dimensions which are measured post-play through the Game Experience Questionnaire (GEQ). Although originally developed to measure users’ experiences with commercial games for entertainment, the GEQ has also been used for game-based learning environments (De Grove, Van Looy, & Courtois, 2010; Gajadhar, Nap, De Kort, & IJsselsteijn, 2008; IJsselsteijn et al., 2007; Nacke, Stellmach, & Lindley, 2011; Oksanen, 2013; Poels, IJsselsteijn, de Kort, & Van Iersel, 2010). The seven dimensions of this framework are competence, challenge, flow, (sensory and imaginative) immersion, negative affect, positive affect, and tension.

The first four of the dimensions are highly interconnected. Flow is a key aspect of most existing game experience frameworks, and it is deeply related to the dimensions of challenge and competence. In Csikszentmihalyi’s flow theory (1991), one of the characteristics of flow is the balance between challenge and ability. Challenge must be neither too low, which would result in boredom, nor too high, which would result in frustration. In the flow theory, the balance of “challenge-skill” is considered to be one of several components of flow and is crucial in educational games (Kiili et al., 2014). IJsselsteijn and colleagues (2007) characterize flow as a form of immersion which results from a player feeling there is balance between how challenging the game is and how competent they are. Jennett and colleagues (2008) distinguish between immersion and flow in that the latter leads to optimal experiences whereas the former does not necessarily do so. However, they admit these concepts may overlap. According to Ermi and Mäyrä (2005), immersion is seen as a broad concept in which (a) flow, or challenge-immersion, is separate from (b) sensory immersion and (c) imaginative immersion. Sensory immersion refers to audiovisual characteristics of the game, such as graphics or sound, whereas imaginative immersion refers, for instance, to the game’s narration or characters. Whereas Ermi and Mäyrä (2005) look at these three types of immersion independently from one another, in the framework of Poels and colleagues (2007) used for the present study, sensory and imaginative immersion together form a single dimension, which refers to the absorption a player might feel towards game features such as story, game world, graphics, or sound.

The other dimensions included in the GEQ—positive affect, negative affect, and tension—focus on post-play affective states which indicate how enjoyable the game experience was. Nacke and Lindley (2009) argue that affect is an essential part of game experience, as it influences the cognitive decisions players take while playing. Based also on physiological responses, they report a correlation between flow and positive affect. Complementing this framework of game experience, our study included an additional dimension of “positive value,” which measures students’ belief that the game is helpful to them. Whitton (2010b) has argued that in order to benefit from game-based learning, users must first believe in the positive value of these games. While she was speaking of adult learners, it is here considered equally relevant for children who are playing games for educational purposes in school.

It is necessary to understand the types of game experiences students have when playing NNG, as positive game experiences foster engagement while negative ones hinder it. An unengaged student might stop playing or only continue reluctantly (Oksanen, 2013), which might have repercussions on the effectiveness of NNG in enhancing students’ math motivation expectancy-values and arithmetic skills. Thus, examining the effect of students’ game experiences on their math motivation expectancy-values and arithmetic fluency was part of the study’s main aim.

Arithmetic Skills

In testing motivational effects of game-based learning, it is also important to compare them with possible cognitive gains. The objective of NNG is to enhance students’ adaptivity with arithmetic problem solving and adaptive number knowledge (see Brezovszky et al., in this volume; Lehtinen et al., in this volume). However, in the present study about the motivational effects of the game, arithmetic fluency measures are used as an indicator of arithmetic skill development because the adaptive number knowledge results are not yet available. Arithmetic fluency refers to the quick and accurate retrieval of basic number facts and combinations and is a requisite for further conceptual and procedural development (Baroody, Bajwa, & Eiland, 2009; Canobi, 2009). While enhancing arithmetic fluency is not the main goal of NNG, given its relation to adaptive number knowledge (McMullen, Brezovszky, Rodríguez-Aflecht, Pongsakdi, & Lehtinen, 2015), it is used in the present study as a proximal indicator of the game’s mathematical impact.

Research Questions

This study focuses on the following research questions:

  1.

    What is the effect of playing NNG on students’ motivation towards mathematics, as framed by the expectancy-value model?

    Prior evidence on the motivational effects of game-based learning environments is mixed (Connolly et al., 2012). Thus, we assume that there is no strong overall development in math motivation, particularly when taking into account that the game-based learning environment used is a prototype that includes few externally motivating elements typical for commercial games. However, as gaming itself is already different from regular mathematics education, we assume that this, together with the nonstandard tasks to be solved within NNG, will produce a novelty effect and result in a slight increase in interest in mathematics.

  2.

    What are students’ experiences with the game and how do these experiences differ by gender and grade level?

    Considering that NNG is a prototype still lacking many externally motivating features common in the commercial games children are accustomed to (sound, advanced graphics, etc.), we expect students to rate their experiences close to the scales’ midpoints. Gender differences have been much examined, often focusing on frequency of play, types of games preferred, and self-efficacy beliefs (Bourgonjon, Valcke, Soetaert, & Schellens, 2010; Carr, 2005; Jenson & de Castell, 2010). While NNG is meant to be gender-neutral, based on findings of earlier studies (e.g., Lucas & Sherry, 2004) we assume boys might report more positive game experiences. As the same version with the same difficulty level of NNG was used for children in different grade levels and different stages of their arithmetical development, we expect differences by grade level in game experiences, particularly those of challenge and competence.

  3.

    How are students’ game experiences with NNG related to changes in (a) motivation expectancy-values and (b) arithmetic fluency?

    Empirical studies directly analyzing the relationship between game experience and motivational effects are still rare. Based on earlier studies (e.g., the seminal work of Lepper & Malone, 1987) Paras and Bizzocchi (2005) concluded that if games foster play and challenge, which produces a state of flow, then gameplay can result in increases in motivation, which supports the learning process. Students who have more positive experiences with the game are probably more engaged, and this might result in positive motivational and cognitive consequences. There was no systematic scaffolding or teacher support in this experiment and the “energy maps” (see NNG description by Lehtinen and colleagues, in this volume) are demanding, which can have diverse effects on students’ mathematics self-efficacy beliefs. Students who did not experience competence during gameplay might report lower mathematics self-efficacy at post-test, whereas those who felt competent during gameplay might report stronger mathematics self-efficacy.

Method

Participants

A total of 1168 students from 61 fourth- through sixth-grade classrooms across four cities in Finland participated in this study. Participation was voluntary both for teachers and students, and informed consent was acquired in writing from the parents of all participants. Ethical guidelines of Turku University were followed. Of the total, 546 participants were female, 620 were male, and there was missing data on the gender of two participants. As for grade level, 135 participants were fourth graders, 606 were fifth graders, and 427 were sixth graders. The mean ages for the fourth, fifth, and sixth grade participants were 10 years and 2 months, 11 years and 2 months, and 12 years and 3 months, respectively. Classes were randomly assigned into control and experimental groups, with 642 participants belonging to the experimental group and 526 to the control group.

Procedure

During the spring term 2014, the experimental group played NNG for a 10-week period as part of their regular math classes and curriculum, while the control group continued only with their regular textbook-based mathematics curriculum. Afterwards, conditions were reversed. While the present study only encompasses this first phase of the experiment, it is relevant to mention the reversal of conditions not only because it would have been ethically questionable to deny participants in the control group the chance to play, but also because the control group’s knowledge about the upcoming play sessions could have an impact on some post-test measures. Students were asked to play for at least 10 h. Teachers were invited to a training session in which they were informed about NNG’s learning aims and play mechanics. As part of their training, teachers were told sessions needed to last at least 30 min in order to give their students enough time to make significant progress in the game. Nevertheless, teachers were free to decide how long play sessions would last, how to space these sessions throughout the intervention, what kind of support they would provide their students, and whether students would play individually or in pairs. There were no instructions for teacher support during the gaming processes. When students played in pairs, teachers chose the criteria under which pairs would be formed.

Measures

Data used for this study was collected by questionnaires and math tests completed by students before the 10-week intervention and immediately after the 10 weeks. Students in the experimental group received a copy of the game immediately upon completing the pre-test, and class teachers were free to schedule game sessions as they saw fit. Game log data was collected upon completing the post-test, but it is not analyzed in the current study. Both pre- and post-tests were rigorously timed and structured, and were administered by trained testers following standardized procedures. Both pre- and post-questionnaires were filled out by students during regular class time under the guidance of their teachers. The pre-questionnaire was identical for all participants, containing demographic items and items measuring their math motivation expectancy-values, while the experimental group’s post-questionnaire included additional items concerning their game experiences playing NNG.

Math Motivation Expectancy-Values: Fourteen items measuring math expectancy-values were completed before and after the intervention by all participants. The test was modified on the basis of the motivation scale used by Berger and Karabenick (2011), with items being translated into Finnish and adapted to the ages of respondents. Three items were used to measure interest (for example, “I like math”). Three items measured utility (“Math is useful for me in everyday life”). Three items measured attainment value (“It is important to me to be a student who is good at math”). Two items were used to measure cost (“I believe that success in math requires that I give up other activities that I enjoy”). Three items were used to measure self-efficacy (“I am certain I can do difficult math tasks”). Participants responded to each item using a 5-point Likert scale: 1 (completely disagree), 2 (disagree), 3 (neutral), 4 (agree), and 5 (completely agree). These items were studied through principal component analysis with varimax rotation. Five separate factors (interest, utility, attainment value, self-efficacy, and cost) were found, upholding the 5-factor model developed by Eccles and Wigfield (2002). The explained variance of the model was 75.90 %. Data was adequate for factor analysis with a 0.90 Kaiser–Meyer–Olkin Measure, and Bartlett’s test of sphericity showed a significance of p < 0.001. All but one factor were shown to have good internal consistency and to be reliable across the two tests. At pre-test: interest Cronbach’s α = .91, utility α = .80, attainment value α = .82, cost α = .50, and self-efficacy α = .81. At post-test: interest Cronbach’s α = .91, utility α = .79, attainment value α = .83, cost α = .58, and self-efficacy α = .81. The poor reliability of cost compared to the other measures may be due to this dimension only having two items; consequently, the expectancy-value of cost was not used for any further analyses. Correlations of the variables at pre-test can be found in the Appendix.
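As an illustration of the internal consistency measure reported above, the following is a minimal sketch of how Cronbach’s α can be computed for one subscale. It is not the authors’ analysis script; the function name `cronbach_alpha` and the response data are hypothetical and chosen only so that the result can be checked by hand.

```python
from statistics import pvariance


def cronbach_alpha(items):
    """Cronbach's alpha for one subscale.

    `items` holds one list of scores per questionnaire item, with all
    lists covering the same respondents in the same order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    n = len(items[0])
    if any(len(scores) != n for scores in items):
        raise ValueError("every item needs one score per respondent")
    # Sum score of each respondent across the k items.
    totals = [sum(row) for row in zip(*items)]
    item_variance_sum = sum(pvariance(scores) for scores in items)
    total_variance = pvariance(totals)
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical 5-point Likert answers of four respondents to two items.
item_a = [1, 2, 3, 4]
item_b = [2, 1, 4, 3]
print(cronbach_alpha([item_a, item_b]))
```

With only two items, as in the cost subscale, α depends entirely on how strongly those two items covary, which is one reason short scales tend to show lower reliability.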

Game Experience: Only participants in the experimental group were asked to fill in the Game Experience Questionnaire (GEQ) after the intervention. The Finnish translations used by Oksanen (2013) were used, although the questionnaire was further modified by removing 15 of the 42 items and changing some of the phrasings to better suit our game and the age of our participants. Each item consisted of a statement and a 1–5 scale to indicate level of agreement, with answers ranging from 1 (not at all) to 5 (extremely), with the mid-scale 3 being neutral. The factor structure of the 31 items of GEQ was studied through principal component analysis with varimax rotation. Data was adequate for factor analysis with a 0.95 Kaiser–Meyer–Olkin Measure, and Bartlett’s test of sphericity showed a significance of p < .001. Seven separate factors were found and used as basis for the subscales. The explained variance of the model was 69.60 %. The reliability of subscales was as follows: Challenge (e.g., “I thought playing this game was hard”), α = .66; Competence (“I was good at playing”), α = .81; Flow (“I forgot everything around me when I played”), α = .79; Immersion (“I felt imaginative when I played”), α = .77; Negative affect (“I thought playing was boring”), α = .78; Positive affect (“I thought playing was fun”), α = .92; Positive value (“This game helped me learn math”), α = .82; Tension (“I felt irritable when I played”), α = .77. The reliability for challenge is low, which could be due to the removal of three items, although Oksanen (2013) also reported similar results. However, according to Clark and Watson (1995), this reliability is within acceptable limits. As suggested by the theory, there are high correlations between flow, challenge, and competence (Appendix).

Arithmetic Fluency: Students’ fluency in solving basic arithmetic tasks is used in this study as a basic measure of cognitive outcomes. The test was adapted from the Mathematical Fluency test of the Woodcock-Johnson Tests of Achievement (Woodcock, McGrew, & Mather, 2001), in which students have three minutes to answer as many simple arithmetic problems as they can. The minimum score was 0 and the maximum score 160. Changes made to the original test include replacing the multiplication symbol with its equivalent in Finland, that is, ∙ was used instead of x.

Results

Results are organized into three subsections. The first subsection presents the effects of the intervention on math motivation expectancy-values. The second subsection describes the experimental group’s game experiences with NNG as well as gender and grade level differences in these experiences. Finally, the third subsection explores how the experimental group’s game experiences with NNG related to changes in their (a) math motivation expectancy-values and (b) arithmetic fluency.

The Effects of Intervention on Math Motivation and Arithmetic Skills

Descriptive statistics on pre- and post-test math expectancy-values subscales are presented in Table 1. A repeated measures ANOVA analyzing the effects of time (pre- and post-test) and condition (experimental or control) on math motivation expectancy-values measures was conducted. Overall, there was no main effect of time on interest or utility, but there was a slight decrease in attainment value, F(1, 1166) = 8.80, p = .003, ηp² = 0.01, and in self-efficacy, F(1, 1166) = 5.14, p = .02, ηp² = 0.004. There was a small interaction effect of time and condition on interest, F(1, 1166) = 13.21, p < .001, ηp² = 0.011, on utility, F(1, 1166) = 7.15, p = .008, ηp² = 0.01, and on attainment value, F(1, 1166) = 7.51, p = .006, ηp² = 0.01, showing a small decrease in these motivational aspects amongst the experimental group when compared with the control group.
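The effect sizes above can be cross-checked against the reported F statistics. The sketch below uses the standard identity between partial eta squared and F, namely F × df_effect / (F × df_effect + df_error); the function name `partial_eta_squared` is an illustrative choice and not taken from the study.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    effect = f_value * df_effect
    return effect / (effect + df_error)


# F values reported in this section, each with df = (1, 1166).
reported_f = {
    "attainment value (time)": 8.80,
    "self-efficacy (time)": 5.14,
    "interest (time x condition)": 13.21,
    "utility (time x condition)": 7.15,
    "attainment value (time x condition)": 7.51,
}
for label, f_value in reported_f.items():
    print(label, round(partial_eta_squared(f_value, 1, 1166), 3))
```

For F(1, 1166) = 13.21 this gives about 0.011, and for F(1, 1166) = 5.14 about 0.004, in line with the values reported above; the remaining values round to 0.01.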

Table 1 Descriptive statistics on motivation expectancy-values

There was no difference in the arithmetic fluency scores in the pre-test (experimental group M = 70.56; control group M = 70.23). Arithmetic fluency scores increased significantly in the post-test (experimental group M = 80.15; control group M = 77.87), F(1, 1166) = 589.52, p < .001. There was a significant interaction effect of time and condition, F(1, 1166) = 5.99, p = .015, ηp² = 0.01, showing a small positive intervention effect on arithmetic fluency.

Game Experiences with NNG

Mean scores (Table 2) for the different dimensions of the GEQ averaged between 2 and 3. An independent samples t-test was run to determine whether there were differences in game experiences between girls and boys. Results showed significant differences between girls’ and boys’ experiences of challenge and competence while playing NNG; however, effect sizes were small (Table 2). Girls rated the game as more challenging than boys did, which means they were more likely to find the game difficult and that they had to make an effort when playing. Boys had higher competence scores than girls did, which means they were more likely to report feeling good, successful, and skillful while playing.
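The group comparison described above can be sketched as follows. This is a minimal illustration that assumes a Student’s t-test with pooled variance and Cohen’s d as the effect size; the text does not state which variant or effect size index was used, and the scores below are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pooled_sd(a, b):
    """Pooled standard deviation of two independent samples."""
    na, nb = len(a), len(b)
    return sqrt(((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2))


def student_t(a, b):
    """Independent samples t statistic, assuming equal variances."""
    return (mean(a) - mean(b)) / (pooled_sd(a, b) * sqrt(1 / len(a) + 1 / len(b)))


def cohens_d(a, b):
    """Standardized mean difference between two groups."""
    return (mean(a) - mean(b)) / pooled_sd(a, b)


# Hypothetical competence subscale means for two small groups.
boys = [2, 3, 4]
girls = [1, 2, 3]
print(student_t(boys, girls), cohens_d(boys, girls))
```

The t statistic has len(a) + len(b) - 2 degrees of freedom. With samples as large as those in this study, even a small standardized difference can reach significance, which is why the effect size matters when interpreting the result.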

Table 2 Game experiences of the experimental group, by gender, and gender differences

A one-way ANOVA was carried out to look at grade level differences in game experiences (Table 3). There were significant differences between the grade levels in flow, immersion, negative affect, positive value, and tension. Post hoc comparisons using Bonferroni correction indicated that fifth graders’ experience of immersion was significantly higher than sixth graders’ (Mean difference = 0.18, p = .033). This means that fifth graders reported higher feelings of being imaginative, liking the story, and being able to explore.

Table 3 Game experiences of experimental group by grade level

The scores of students’ belief in the positive value of the game also significantly differed between fourth graders and sixth graders, with fourth graders seeing more benefit in playing NNG (Mean difference = 0.30, p = .049) and feeling the game helped them learn math. Scores for negative affect differed significantly both between fourth and fifth graders, with fifth graders having a higher score in negative affect (Mean difference = 0.36, p = .021), and between fourth and sixth graders, with sixth graders having a higher score in negative affect (Mean difference = 0.44, p = .004). That is, the youngest students were less likely to report feelings of boredom while playing. However, fourth graders were more likely to feel annoyed or irritable while playing than fifth graders, with the scores for tension significantly differing between fifth and fourth graders (Mean difference = 0.46, p = .014). There were no significant differences between grade levels in the experiences of challenge, competence, or positive affect.
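The grade level analysis combines an omnibus one-way ANOVA with Bonferroni-corrected pairwise comparisons. The sketch below shows both steps in their textbook form; it is an illustration under assumed, hypothetical data, and the function names are not taken from the study.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of independent groups."""
    scores = [x for group in groups for x in group]
    grand_mean = mean(scores)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = 0.0
    for g in groups:
        group_mean = mean(g)
        ss_within += sum((x - group_mean) ** 2 for x in g)
    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical subscale scores for three grade levels.
grade4, grade5, grade6 = [1, 2, 3], [2, 3, 4], [3, 4, 5]
print(one_way_anova_f([grade4, grade5, grade6]))
# Hypothetical unadjusted p-values for the three pairwise comparisons.
print(bonferroni([0.01, 0.02, 0.5]))
```

With three grade levels there are three pairwise comparisons, so each unadjusted p-value is multiplied by three. Statistical packages commonly report post hoc p-values that already include this adjustment, so they should not be corrected a second time.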

Effects of Game Experiences on Math Motivation Expectancy-Values and Arithmetic Fluency

In order to determine how the experimental group’s game experiences related to their post-test scores for (a) math motivation expectancy-values and (b) arithmetic fluency, multiple linear regression analyses were conducted. As dependent variables, the post-test sum scores of the math motivation expectancy-values of interest, utility, attainment value, and self-efficacy were used, as well as arithmetic fluency post-test scores. For each dependent variable, its corresponding variable at pre-test as well as all game experience variables and arithmetic fluency were included as independent variables. Table 4 provides the results of these analyses.

Table 4 Regression analyses on post-test math motivation expectancy-values

Models were sufficient for explaining the post-test (Total R²s > .33). For all the math motivation expectancy-values of interest, utility, attainment value, and self-efficacy, the corresponding pre-test variables proved to be the strongest predictive variables. In all these cases, the only other significant predictor of post-test scores was the game experience of competence (βs > .12). In the case of arithmetic fluency, F(9, 997) = 236.82, p < .001, R² = .68, only pre-test arithmetic fluency was a predictor (β = .82, p < .001), with game experiences not playing any role on post-test results.
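The regression setup, in which a post-test score is predicted from its pre-test counterpart plus game experience variables, can be sketched with ordinary least squares. The code below is a minimal pure-Python illustration that solves the normal equations; it is not the authors’ analysis, uses only two predictors instead of the full set, and all data and names are hypothetical. Standardizing the variables first, as `zscores` does, turns the coefficients into standardized βs.

```python
from statistics import mean, stdev


def zscores(values):
    """Standardize a variable to mean 0 and sample standard deviation 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def ols(rows, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    p = len(design[0])
    # Augmented matrix [A^T A | A^T y].
    aug = [
        [sum(d[i] * d[j] for d in design) for j in range(p)]
        + [sum(d[i] * yi for d, yi in zip(design, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(p):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[i][p] for i in range(p)]


# Hypothetical rows of (pre-test score, competence); the outcome is built
# as 1 + 2 * pre + 3 * competence so the fit can be verified by hand.
predictors = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
post = [1, 3, 4, 6, 8]
print(ols(predictors, post))  # approximately [1.0, 2.0, 3.0]
```

Including the pre-test score as a predictor means the remaining coefficients describe how game experience relates to the part of the post-test score that the pre-test does not already explain.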

Discussion

The aim of the present study was to investigate the effects of gameplay on students’ math motivation as measured by using the expectancy-value model, and to explore how students’ differing game experiences were related to the changes in their math motivation and arithmetic fluency.

There was a slight decrease in two of the motivation dimensions, attainment value and self-efficacy, for all participants in both experimental and control groups, which is in line with previous research reporting a general decrease of expectancy-values throughout the school term (Berger & Karabenick, 2011; Wigfield & Cambria, 2010). The control group’s interest and utility slightly increased from pre-test to post-test. This could possibly be explained by the fact that participants in the control group anticipated the reversal of conditions and their interest and beliefs in the utility of math were sparked by the upcoming NNG intervention, although data is not sufficient to determine this. For the most part, playing NNG did not have a large impact on students’ math motivation expectancy-values. Compared to the control group, the experimental group showed a slight decrease in three of the dimensions of the expectancy-value model (interest, utility, and attainment value), but all these effects were quite small. In spite of this, there was a slightly positive intervention effect on arithmetic fluency.

When focusing on the game experiences of the experimental group, results show that participants rated their game experiences in a predominantly negative or only slightly positive way. The game experience with the lowest mean score was immersion while the one with the highest mean score was competence. Participants reported higher feelings of competence than of challenge. Thus, the challenge-skill balance necessary to produce the game experience of flow, which leads to positive affect, was not reached. The high scores for competence suggest that the game might have been perceived as too simple, although it is unclear whether this was due to the game’s form (gameplay) or content (arithmetic strategies needed) or both. Results support Whitton’s (2010a) claim that it cannot be assumed that a game dynamic will automatically make something interesting to learners who have no interest in the subject itself. More studies are needed to explore the impact of background factors, such as interest in math and in games, on game experiences. It is also important to further study to what extent improving the design of the game can foster positive game experiences.

There were some differences in students’ game experiences depending on gender and grade level. Girls had higher mean scores for the dimension of challenge while boys had higher mean scores for the dimension of competence. However, it is not clear whether this is due to different perceptions of math skills or gaming skills between genders. Previous research suggests that boys have higher competence beliefs than girls for math even when controlling for skill level, although the gap in gender differences narrows with age (Wigfield & Eccles, 2002). Altogether, competence and challenge significantly differ by gender but not by grade level. Other dimensions such as immersion, positive value, negative affect, and tension differ by grade level, with the mean scores of fourth graders showing the most substantial differences when compared with students from other grade levels. It seems that younger students overall had more positive game experiences than older students. Here, it is important to note the much smaller number of participants in the fourth grade. Finally, the dimension of positive affect is the only dimension of game experience that shows no significant differences by gender or grade level.

Pre-test math motivation expectancy-values played a larger role in predicting post-test math motivation expectancy-values than the different dimensions of game experiences, suggesting that there was little change in expectancy-values due to gameplay. Amongst game experiences, competence was the strongest predictor of post-test expectancy-values in all cases. However, this raises some questions about the validity of the dimension of competence as currently measured by the GEQ. It is not clear whether students interpreted the competence items of the GEQ as referring to their math competence or their gaming competence or both. If the former, then it seems that the items for competence and self-efficacy might have overlapped. However, as the correlation between the variables is only moderate (r = .44), this needs to be further explored. Similarly, game experiences did not predict post-test arithmetic fluency, suggesting that game experiences may not play a role in mathematical learning outcomes.

Implications

It seems that NNG’s mechanics as such were not motivating for the majority of students, even though gameplay resulted in improvement in mathematical skills. Math motivation expectancy-values remained mostly stable, and although the experimental group showed a slight decrease in interest, utility, attainment value, and self-efficacy, these effect sizes were very small. The changes in expectancy-values of the experimental group could partly be predicted by their experience of competence during gameplay, which in turn differs by gender. As the basic mechanics of the game seem to work and resulted in improved mathematical skills, a next step would be to analyze whether new features of later versions of the game will lead to meaningful improvements in gaming experiences. It is encouraging that regardless of the quality of gaming experience, NNG is still effective in improving arithmetic fluency.

Limitations and Future Directions

Conditions could vary somewhat between classrooms, and there is no detailed information about the role of the teacher in, for example, debriefing, feedback, support activities, or reflection. However, giving teachers the freedom to use the game as they saw fit allowed for testing the effectiveness of the game in the most natural school settings possible. The detailed log data, which will be analyzed in the future, will give some information about these differences, but in future studies it is important to collect detailed data on teachers’ roles during gameplay.

A major limitation of this study is its dependence on subjective and self-reported data. Informal feedback from teachers paints a different picture of students’ experiences, as many teachers claimed their students were very engaged while playing and enjoyed the experience. This will be remedied in the future with the addition of a feedback feature within the game itself, which will make it possible for players to give feedback on their affective states upon completing a map, in a situated way that does not disrupt flow.

When looking at game-based learning environments, it is important to acknowledge their oxymoronic nature (Abt, 1987). Jenkins (2011) brings up the contradictory relationship between playing (a “freely chosen irresponsibility”) and learning (an “assigned responsibility”). Along similar lines, it has been argued that having teachers decide what games will be played, for how long, and under which circumstances will have repercussions on the level of control felt by students and consequently on their motivation (Wouters et al., 2013). Different results may be achieved when play is free and voluntary, as opposed to a formal and prescribed school activity (Islas Sedano, Leendertz, Vinni, Sutinen, & Ellis, 2013), as it has been reported that playing in a different context, such as at home, increases players’ enjoyment, identification, and learning experiences (De Grove, Van Looy, Neys, & Jansz, 2012). An important next step could be to study the effects on motivation and core gaming experiences of having students play voluntarily at their homes.