1 Introduction

The use of data from playtesting earlier versions of a game is a common and valuable approach to improve user experience of entertainment games (for a basic introduction see [1]). These so-called game analytics aim at discovering and communicating patterns in data to inform, for instance, the ongoing design process of a game to optimize user experience. Such game analytics might be even more important in developing successful educational games by not only optimizing user experience but learning outcomes in particular (e.g., [2, 3]). In the current study, we discuss the background of developing a new task and game mechanic based on previous research results and reanalyses of user data from an earlier version of an educational game designed for learning rational numbers.

Educational games are primarily defined by their educational value. While general models such as the Learning Mechanics-Game Mechanics (LM-GM) model [4] can support the development process, a more in-depth and domain-specific analysis is necessary to optimize learning outcomes. As such, findings from basic research can inform design decisions of an educational game early on (e.g., [5,6,7]). During the development process of the game employed in the current study, we considered recent findings from numerical cognition research (e.g., [8] for a review on rational number knowledge) to adapt game mechanics for increasing learning outcomes.

The game of interest, Semideus, primarily focuses on assessing and improving rational number (i.e., fractions and decimals) knowledge in young students. Fractions are one of the most challenging topics in mathematics education [9]. They are also crucial for learning more advanced mathematics, as indicated by fraction knowledge being a very good predictor of future algebra performance and overall mathematics achievement [10, 11]. Unfortunately, many students fail to master fractions [8, 9]. One major difficulty for students is understanding fraction magnitude (for a review, see [8]). Importantly, two types of tasks are primarily used for assessing and improving fraction magnitude knowledge: number line estimation and magnitude comparison tasks (e.g., [12,13,14,15]; for a recent review see [16]). Consequently, these two tasks were considered crucial for a game that focuses on improving rational number knowledge.

The number line estimation task is based on the metaphor of a mental number line, according to which small/large magnitudes are associated with the left/right side of space in western cultures (for an overview see [17]). Accordingly, in number line estimation (e.g., [18]), participants have to indicate the spatial position of a target number on a horizontal line with only its start and endpoint specified [e.g., where does \( 4/5 \) (67) go on a number line ranging from 0–1 (0–100)]. Basic research suggests that training such spatial representations of number magnitude improves more than just the accurate mapping of numbers onto the mental number line: it also generalizes to other numerical competencies. As a consequence, training involving the concept of a mental number line seems to be particularly effective when promoting young students’ numerical competencies (e.g., [19, 20]). Hence, Semideus uses number line estimation as its most basic task.

In the typical number magnitude comparison task, participants have to decide which of two numbers is larger [e.g., \( 4/5 \) (80) is larger than \( 2/3 \) (66)]. Interestingly, participants’ responses become slower and more error prone the closer in magnitude the to-be-compared numbers are (e.g., [21, 22]). This so-called numerical distance effect is taken to indicate a spatial representation of number magnitude and supports the idea of a mental number line, even though an actual comparison of spatial magnitudes is not explicitly required in this task. Moreover, performance on number line estimation and number magnitude comparison tasks is highly correlated for both whole numbers (e.g., [23,24,25]) and fractions, though so far only two studies have examined this relation with fractions [12, 26]. Importantly, studies involving training of number line estimation often use magnitude comparison as an evaluation task to assess training effects (e.g., [27]). It is assumed that it should be easier to differentiate numbers by their magnitude after participants have been trained to estimate the location of a magnitude on a number line. However, at least for fraction learning, it is unclear whether this relation is preserved when controlling for individuals’ overall math achievement or math grades, which has not been done in previous studies. If this relationship disappears, one might argue that the association between number line estimation and magnitude comparison is due to better overall math performance rather than indicating a shared underlying representation of number magnitude.

When implementing magnitude comparison in our game, we slightly changed the conventional number magnitude comparison task to enhance students’ spatial representation of number magnitude. In particular, we designed the comparison to take place on a number line with its endpoints defined as “small” on the left and “large” on the right (see Game Description and Fig. 1, right chart). Participants had to place numbers on the number line in correct ascending order. On this task, the absolute position of the to-be-compared numbers is irrelevant, but the basic characteristics of a comparison task are maintained, and spatial aspects of number magnitude might be enhanced. However, in a recent training study [15] that employed rational number estimation and our adapted comparison task, we identified significant improvements only on the estimation tasks. We argued that the magnitude comparison task in its current form might not accurately assess the spatial representation of number magnitude. Improving this magnitude comparison task might be necessary to accurately assess rational number knowledge.

Fig. 1. Left chart: Example of an estimation task; Right chart: Example of a comparison task; klein = small, gross = large.

To increase the educational value of our game, the current study examined and reanalysed data from two previous studies [14, 15] employing our game-based estimation and adapted magnitude comparison task in two contexts (i.e., assessment and training) and two age groups (4th and 5th graders). We evaluated the relation between performance in number line estimation and magnitude comparison. The goal was to examine whether these two tasks share the same underlying representation. This would indicate that a combination of number line estimation and magnitude comparison task mechanics might improve future assessment and training of rational number knowledge.

2 Method

In Study 1 ([14], Assessment study), our game-based learning environment was used to assess conceptual knowledge of fractions in Finnish fifth graders. We were able to replicate hallmark effects of fraction magnitude processing typically observed in basic research, such as the numerical distance effect. This suggested that game-based learning environments for fraction education may also allow for a valid assessment of students’ fraction knowledge. In Study 2 ([15], Training study), we employed the same game-based environment as a training tool to improve fourth graders’ conceptual knowledge of fractions. Results indicated that the game-based training group improved their conceptual knowledge of fractions more strongly than a control group. Reanalysing data from both studies allowed us to examine and compare data from different contexts and age groups. This provides us with the promising opportunity to revisit earlier iterations of a game-based learning environment to influence the ongoing design process. Importantly, analysing and comparing data from a training and an assessment study allows us to transfer our results from one context to the other. In this section, we first describe the game-based environment and then other relevant aspects of both studies. For a more comprehensive description of the two studies see [14, 15].

2.1 Game Description

In both studies, similar versions of the game-based learning application Semideus were used. Semideus is a game engine that allows for creating game-based tasks for improving and assessing rational number knowledge (e.g., [14, 28]). The game is set in ancient Greek times. Users control the avatar “Semideus” (by tilting the tablet), who tries to find and retrieve gold coins that the goblin Kobalos has stolen from Zeus. Kobalos has hidden the coins along the trails of Mount Olympos. Semideus has to discover the locations of the hidden coins, encrypted in mathematical symbols (e.g., fractions). The core gameplay requires working with number lines (e.g., [26]) to perform number line estimations and magnitude comparisons. Number lines were implemented as walkable platforms.

In the estimation tasks, users had to move the avatar to the correct position on the number line. Figure 1 (left chart) shows an example of an estimation task in which players had to locate the fraction 5/6 on a number line. Estimates more than 8% off from the correct position were categorized as inaccurate. For correct estimates, players received 100–500 coins depending on the degree of accuracy.
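The scoring of an estimate can be sketched as follows. This is a minimal illustration in Python, not the game's actual code: only the 8% threshold and the 100–500 coin range come from the description above. The function names, the linear mapping from error to coins, and the choice of zero coins for inaccurate estimates are assumptions made for the example.

```python
from fractions import Fraction

ACCURACY_THRESHOLD = 0.08  # from the text: more than 8% off counts as inaccurate


def estimation_error(target, estimate, line_min=0, line_max=1):
    """Absolute error expressed as a proportion of the number line's length."""
    return abs(float(estimate) - float(target)) / (line_max - line_min)


def is_accurate(target, estimate, line_min=0, line_max=1):
    """An estimate counts as accurate if it is at most 8% of the line away."""
    return estimation_error(target, estimate, line_min, line_max) <= ACCURACY_THRESHOLD


def coins_for_estimate(target, estimate, line_min=0, line_max=1):
    """Hypothetical reward rule (assumption): 500 coins for a perfect estimate,
    falling linearly to 100 coins at the threshold, and 0 coins beyond it."""
    error = estimation_error(target, estimate, line_min, line_max)
    if error > ACCURACY_THRESHOLD:
        return 0
    return round(500 - 400 * error / ACCURACY_THRESHOLD)


# Example from Fig. 1: locating 5/6 on a number line from 0 to 1.
target = Fraction(5, 6)
print(is_accurate(target, 0.80), coins_for_estimate(target, 0.80))
print(is_accurate(target, 0.70), coins_for_estimate(target, 0.70))
```

Expressing the error relative to the line length keeps the same threshold usable for lines with other ranges, such as 0–100.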

In our adapted magnitude comparison task, users had to arrange two stones, each engraved with a number, in ascending order on a number line ranging from “small” to “large”. Hence, the absolute position of the to-be-compared numbers on the number line was irrelevant as long as the relative position of the stones to each other was correct. For instance, players had to place the stone “2/5” to the left of the stone “1/2” for the placement to be correct. Correct comparisons were rewarded with 100–500 coins depending on the response time of the users, with higher rewards for faster responses.
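The correctness rule of the adapted comparison task depends only on the order of the two stones, which the following Python sketch makes explicit. The function names, the treatment of ties, and the response-time limits used for the reward are hypothetical; the text specifies only that faster correct responses earn more of the 100–500 coins.

```python
from fractions import Fraction


def comparison_correct(value_a, pos_a, value_b, pos_b):
    """True if the stone with the smaller number lies to the left of the other.

    Positions are x-coordinates on the line (left = "small"); only their
    order matters, not their absolute values.
    """
    if value_a == value_b or pos_a == pos_b:
        return False  # assumption: ties are not treated as a valid ordering
    return (value_a < value_b) == (pos_a < pos_b)


def coins_for_comparison(correct, response_time_s, fastest_s=2.0, slowest_s=10.0):
    """Hypothetical reward rule (assumption): 500 coins at fastest_s or quicker,
    falling linearly to 100 coins at slowest_s or slower; 0 coins if incorrect."""
    if not correct:
        return 0
    clamped = min(max(response_time_s, fastest_s), slowest_s)
    share = (slowest_s - clamped) / (slowest_s - fastest_s)
    return round(100 + 400 * share)


# Example from the text: "2/5" must be placed to the left of "1/2".
ok = comparison_correct(Fraction(2, 5), 0.45, Fraction(1, 2), 0.46)
print(ok, coins_for_comparison(ok, response_time_s=6.0))
```

Note that the two stones in the example are placed almost on top of each other and far from their true positions, yet the placement is correct. This is the property that distinguishes the task from number line estimation.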

2.2 Study 1 (Assessment Study)

Participants:

Fifty-four fifth-graders (25 male; mean age = 11.26 years, SD = 0.48 years) participated in the study. They were equipped with iPads and had 30 min to complete the game. Seven students failed to provide their math grades. Thus, only 47 students were included in the final analyses.

Procedure:

All students were examined during regular school hours. First, experimenters introduced the game and explained game mechanics to students. Then, students received their user account, which was used to record individual game behaviour. Each student received an iPad and played the game individually. They were not allowed to discuss the game with other students during the play session. Math achievement was measured by participants’ previous math grade (Finnish classification scheme: 10 reflects the highest and 4 the lowest grade).

2.3 Study 2 (Training Study)

Participants:

In the training group, 68 fourth-graders played the game, of whom 54 (mean age = 10.24 years; SD = .43; 25 males) followed the requested protocol. That is, they participated in both the pre- and posttest and played the game in-between. Another 45 students were recruited for the control group, of whom 41 (mean age = 10.02 years; SD = .27; 25 males) participated in both the pre- and posttest. These children did not play the game and therefore are not considered here.

Procedure:

Experimenters explained the game to students before they started to play it. Students had to play the game in 5 sessions of about 30 min each during a four-week period. There was no additional teaching of rational numbers in school during the study period. Math achievement was again measured by participants’ previous math grade in the Finnish classification system. Log files of students’ estimation and comparison performance during the games were analysed.

2.4 Analysis

Correlations were computed between fraction magnitude comparison performance and fraction estimation accuracy for fractions between 0 and 1 for both the Assessment study (Study 1) and the Training study (Study 2). We also conducted partial correlation analyses between these two measures, controlling for students’ previous math grade (as an index of overall math performance). In a partial correlation, the variance that each measure shares with the control variable (math grade) is removed before the two measures are correlated. This allowed us to investigate whether the correlation between performance in these two tasks is still present when controlling for overall math performance. The correlations and partial correlations were computed using R [29] and the R package corrplot [30]. Data visualization was realized with the package ggplot2 [31].
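To make the logic of the partial correlation concrete, the sketch below computes it in two equivalent ways: with the first-order formula, and by correlating the residuals that remain after regressing each measure on math grade. It is written in Python rather than R and runs on simulated values, so it illustrates the technique only; it is neither the analysis script nor the data of the two studies, and all names are hypothetical.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_r_formula(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    r_xy, r_xz, r_yz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def residuals(v, z):
    """Residuals of v after a simple least-squares regression of v on z."""
    n = len(v)
    mv, mz = sum(v) / n, sum(z) / n
    slope = (sum((a - mz) * (b - mv) for a, b in zip(z, v))
             / sum((a - mz) ** 2 for a in z))
    return [b - (mv + slope * (a - mz)) for a, b in zip(z, v)]


def partial_r_residuals(x, y, z):
    """Partial correlation as the correlation of the two residual series."""
    return pearson_r(residuals(x, z), residuals(y, z))


# Simulated values only (NOT the study data): grades on the Finnish 4-10 scale.
random.seed(0)
grade = [float(random.randint(4, 10)) for _ in range(47)]
estimation = [0.5 * g + random.gauss(0, 1) for g in grade]
comparison = [0.5 * g + 0.6 * e + random.gauss(0, 1)
              for g, e in zip(grade, estimation)]

print(round(pearson_r(comparison, estimation), 3))
print(round(partial_r_formula(comparison, estimation, grade), 3))
print(round(partial_r_residuals(comparison, estimation, grade), 3))
```

The last two printed values agree, because removing the grade-related variance from both measures and then correlating what is left is exactly what the formula computes.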

3 Results

3.1 Study 1 (Assessment Study)

Students’ performance in fraction comparison and fraction estimation was positively correlated [r(45) = .64, p < .001, see also Fig. 2 Panel A]. Moreover, after controlling for individuals’ overall math achievement, the correlation remained positive and significant [r(44) = .41, p < .005, see also Fig. 2 Panel B].

Fig. 2. Panel A: Scatterplot for comparison performance and estimation accuracy; Panel B: Scatterplot for residuals of comparison performance and estimation accuracy controlling for math grade.

3.2 Study 2 (Training Study)

As in Study 1, students’ performance in fraction comparison and fraction estimation was significantly correlated [r(52) = .62, p < .001, see also Fig. 2 Panel A]. Again, the correlation remained significant even after controlling for individuals’ overall math achievement [r(51) = .46, p < .001, see also Fig. 2 Panel B].
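The degrees of freedom reported in both studies follow the usual convention: n − 2 for a simple correlation and one fewer for each control variable in a partial correlation. The short Python sketch below reproduces them from the sample sizes (47 and 54) and converts each reported coefficient into the corresponding t statistic. The function names are hypothetical, and the script recomputes only from the values printed above, not from the raw data.

```python
import math


def correlation_df(n, n_controls=0):
    """Degrees of freedom: n - 2, minus one per control variable."""
    return n - 2 - n_controls


def t_from_r(r, df):
    """t statistic for testing a (partial) correlation against zero."""
    return r * math.sqrt(df / (1 - r ** 2))


# (label, sample size, number of control variables, reported coefficient)
reported = [
    ("Study 1, correlation", 47, 0, 0.64),
    ("Study 1, partial correlation", 47, 1, 0.41),
    ("Study 2, correlation", 54, 0, 0.62),
    ("Study 2, partial correlation", 54, 1, 0.46),
]

for label, n, n_controls, r in reported:
    df = correlation_df(n, n_controls)
    print(f"{label}: df = {df}, t = {t_from_r(r, df):.2f}")
```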

4 Discussion

In the present study, we reanalysed data from two previous studies [14, 15] to better understand the relation between number line estimation and magnitude comparison. Medium to large correlations remained between estimation accuracy and comparison performance of fractions even when controlling for overall math achievement, suggesting that both tasks draw on the same underlying representation of number magnitude. In the following, we consider the results of the current study and recent literature to design and propose a new way of assessing and training conceptual knowledge of (rational) number magnitude. This data-driven approach may result in new tasks/mechanics, which might be more beneficial not only for learners but also for educators and researchers in the domain of numerical cognition, as they might provide more detailed information about users’ strategies and competencies.

Current and previous studies indicate that number line estimation and magnitude comparison share the same underlying representation of number magnitude [20, 24, 25]. Our data show that this cannot be explained by individuals’ overall math performance, as we controlled for it. However, this also suggests that previous implementations of the comparison task within Semideus may not have been optimal, as we did not observe significant improvements in comparison performance following the training procedure [15]. This was surprising because correlations between number line estimation and magnitude comparison performance in fractions had already been described elsewhere [12, 26]. To overcome this shortcoming of our current implementation, we redesigned the comparison task mechanics. To draw users’ attention more explicitly to the use of spatial locations on the number line, we combined mechanics of both number line estimation and magnitude comparison. By merging these two tasks, we aim to further foster fraction magnitude understanding in future training studies. In the following, we elucidate this data-driven design decision by detailing the new mechanic:

In its current version, Semideus, as a game-based approach, allows us to seamlessly integrate magnitude comparisons and number line estimates into the gameplay through narrative elements of the game. In particular, the main task of users is to dig up a gold coin at a location on the number line specified in the top left corner beside the shovel button (target number; see Fig. 1 left chart). This reflects the basic mechanic of a number line estimation task (e.g., [18]). By integrating “traps” (i.e., positions to avoid) on the number line, it is possible to integrate number magnitude comparisons into the gameplay. The location of the traps is defined in the top right corner (a lightning symbol refers to traps; see Fig. 3 A–D). Accordingly, users need to decide whether a trap is on their way to the position they need to walk to in the primary number line estimation task. This decision reflects number magnitude comparison (see Fig. 3 A: 2/3 is larger than 4/9; Fig. 3 B: 3/7 is smaller than 2/3). If users walk through a trap, they lose virtual energy (see Fig. 3 C).

Fig. 3. Combination of number estimation and magnitude comparison task mechanics; target location/number defined in the top left corner; traps/locations to avoid defined in the top right corner with a lightning symbol; A: Target number is smaller than trap location; user can safely dig out coins without disarming the trap. B: Target number is larger than trap location; user needs to use the mole to disarm the trap; C: User receives negative feedback when walking over a trap and loses virtual energy (orange bar on the right of the screen); D: User has used the mole to disarm the trap (mound of earth) to safely walk to the target location.

To integrate magnitude comparison into the narrative of the gameplay, users are asked to disarm the trap when it is on their way to the position of the gold coin. Players can disarm the trap by pressing a button located at the top of the screen (symbol of a mole; see Fig. 3 A–D). Once the trap has been disarmed with the mole, users can walk over it and dig up the coin at the estimated position. To decide whether disarming the trap is necessary, users have to compare the location of the coin (target number) with the location of the trap and, based on this comparison, disarm the trap or leave it. This requires an explicit comparison of the magnitudes of the target number and the number specifying the location of the trap. Accordingly, this should increase the association of fraction magnitudes with spatial locations on the number line (see a demo of this idea on YouTube: https://youtu.be/cFS7USJJ3pI).
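The decision that the combined mechanic asks of the player can be written down as a simple magnitude comparison. The Python sketch below assumes that the avatar starts at the left end of the number line (position 0), which matches panels A and B of Fig. 3 but is not stated explicitly. The function names and the outcome labels are hypothetical, and a trap located exactly on the target is left out of consideration.

```python
from fractions import Fraction


def trap_on_path(target, trap, start=Fraction(0)):
    """True if the trap lies strictly between the start position and the target.

    Assumption: the avatar walks in a straight line from `start` to `target`.
    """
    return min(start, target) < trap < max(start, target)


def walk_outcome(target, trap, disarmed, start=Fraction(0)):
    """Hypothetical outcome labels for one walk to the target."""
    if trap_on_path(target, trap, start) and not disarmed:
        return "energy lost"   # the avatar walked through an armed trap
    return "safe"


# Values read from the examples in the text (an interpretation of Fig. 3):
# A: target 4/9, trap 2/3 -> the trap lies beyond the target, no need to disarm.
print(trap_on_path(Fraction(4, 9), Fraction(2, 3)))
# B: target 2/3, trap 3/7 -> the trap is on the way and has to be disarmed.
print(trap_on_path(Fraction(2, 3), Fraction(3, 7)))
print(walk_outcome(Fraction(2, 3), Fraction(3, 7), disarmed=False))
```

Using exact fractions avoids rounding problems when target and trap are close in magnitude, which is the case of interest for the numerical distance effect.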

Moreover, this suggested comparison mechanic might make comparison tasks easier to solve. The previous comparison mechanic turned out to be challenging for some students. More specifically, some users had difficulty adopting game controls needed to perform the comparison [32].

A further iteration of the current task mechanic, which is being developed, addresses adaptive gameplay, that is, user-specific adjustments of task difficulty or game progress. Specifically, instead of a mole disarming the trap at a given position on the number line, users may use a piece of wood as a bridge to cross the trap without harm. This alternative mechanic will offer pieces of wood of different lengths. In the beginning, large pieces of wood would cover a wider range of the number line. Thus, knowing approximately where the trap is would be sufficient to avoid it. The available pieces of wood will become smaller as the player becomes more accurate and advances in the game, which requires more precise localisation of traps. This addition to the task mechanic provides a natural way of changing task difficulty and might also inform researchers and educators about students’ strategies when solving such tasks.
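One possible way to realise this adaptive bridge is sketched below in Python. Since the mechanic is still under development, everything here is an assumption made for illustration: the bridge is modelled as an interval centred on the position the player chooses, and its length follows a simple staircase rule that shrinks after a safe crossing and grows again after a failure. The function names, step factors, and length limits are hypothetical.

```python
def bridge_covers_trap(bridge_center, bridge_length, trap):
    """True if the trap lies within the span covered by the piece of wood."""
    half = bridge_length / 2
    return bridge_center - half <= trap <= bridge_center + half


def next_bridge_length(current_length, crossed_safely,
                       shrink=0.8, grow=1.25,
                       min_length=0.05, max_length=0.5):
    """Hypothetical staircase rule (assumption): a shorter piece of wood after
    a safe crossing, a longer one after a failure, kept within fixed limits.
    Lengths are proportions of the number line."""
    factor = shrink if crossed_safely else grow
    return min(max(current_length * factor, min_length), max_length)


# A player who places the bridge at 0.5 although the trap is at 0.65:
length = 0.4
print(bridge_covers_trap(0.5, length, trap=0.65))   # a long bridge still covers it
length = next_bridge_length(length, crossed_safely=True)
print(length)
print(bridge_covers_trap(0.5, 0.1, trap=0.65))      # a short bridge does not
```

Under such a rule, the bridge length a player settles on would itself be a rough indicator of how precisely that player can localise fractions on the number line.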

Future work:

Future studies need to evaluate whether this new task mechanic might (i) be superior to conventional training and assessment methodologies in producing and measuring learning, (ii) provide more information about students’ strategies in solving such fraction magnitude tasks, (iii) extend task diversity, (iv) improve user experience, and (v) add an additional layer of difficulty for already well-performing students. Moreover, in the future we aim at employing similar data- and theory-driven approaches to evaluate new design iterations of the game. This will also include the use of game analytics tools to investigate not only performance but also player experience variables, which might also allow us to identify more complex patterns in the data.

Conclusion:

By reanalysing data from an educational game for learning fractions, we identified an association between number line estimation and number magnitude comparison, even after grades in math courses were statistically controlled. This suggests that both tasks share the same underlying representation of number magnitude. Based on these results and those of previous studies in the domain of numerical cognition, we designed a new task combining task mechanics of number line estimation and number magnitude comparison by including the need for explicit magnitude comparison in our number-line-based game. This new task might improve future training of rational number knowledge. Most importantly, although the current study described task mechanics for an educational game, their use is not limited to gaming contexts. Instead, the technology might as well be utilized in a wide range of areas of numerical cognition.