Introduction

The compelling and interactive design elements of games can be combined with specific curricular content to form digital game-based learning (DGBL) (Prensky 2003). Games that encompass educational objectives are believed to hold the potential to render the learning of academic subjects more learner-centered, enjoyable, and interesting. Although games are believed to be motivational and educationally effective, the empirical evidence supporting this assumption is still limited and contradictory (Marina 2009). Some educators have incorporated games, e.g., the famous Angry Birds, into physics courses to enhance students’ learning motivation. Nevertheless, the extent to which those games can enhance learning remains an open question. In other words, do educators use games (e.g., Angry Birds) to teach some subjects merely because games make learning fun, or can games really improve students’ performance? This issue is worth further investigation.

To examine the effectiveness of games on learning, this study conducted an experiment in which students played a game to learn physics, i.e., Newton’s laws of motion. Participants were divided into two groups according to learning method: one group used a traditional learning method, and the other used DGBL with the SURGE physics game. While students were learning, their eye movements, brain waves, and heartbeat data were measured to analyze their attention, emotions, and problem-solving strategies. All students took a posttest after learning to examine whether the DGBL method could enhance learning.

Literature review

Digital game-based learning

Digital games provide a meaningful framework for solving problems (Annetta 2008) because students are encouraged to synthesize diverse information and analyze strategies, which leads to a greater understanding of the causal links between decision-making behaviors (Ebner and Holzinger 2007). Therefore, digital games can be regarded as a potential learning tool for understanding the link between cause and effect (Kiili 2005). Although research on problem solving in DGBL has been conducted (Dickey 2006; Robertson and Howells 2008), the effectiveness of this pedagogical approach in enhancing problem-solving abilities has not received sufficient attention.

Many studies have unveiled the effects of DGBL, especially on learning interest and motivation (Erhel and Jamet 2013). For instance, Huang et al. (2010) reported a regression analysis that revealed a significant model relating motivational processing (attention, relevance, and confidence) to outcome processing (satisfaction), based on data collected with the ARCS-based Instructional Materials Motivational Survey (IMMS). According to the ARCS scores, learners started out with a successful motivational processing that consisted of a high attention level, a low relevance level, and a high confidence level. At the end of the learning process, however, a relatively low level of satisfaction was observed (Huang et al. 2010). Nevertheless, with its ability to motivate learning, game-based learning can potentially bring much affective experience.

Affective computing in learning

Since affective computing was first proposed, there has been a burst of research that focuses on creating technology that can monitor and appropriately respond to a user’s affective state (Picard 1997). Furthermore, research shows that technology is able to recognize human emotions in different ways. Why is recognizing human emotion an important research area? Recent scientific findings indicate that emotion plays an essential role in decision-making, perception, learning, and more (Ben Ammar et al. 2010).

Recent evidence suggests that a digital game can affect players’ emotions. This was tested by employing facial electromyography (EMG), which reveals emotional expression by directly measuring the electrical activity associated with facial muscle contractions (Ravaja 2004). Signals of facial EMG activity and skin conductance were able to indicate players’ responses, and to assess players’ emotions during gameplay (joy, pleasant relaxation, fear, anger, and depressed feeling), in response to short-duration emotional game events (Ravaja et al. 2008). Ravaja et al. (2008) found that game events can indeed lead to changes in emotional state. Other researchers have tried to interpret the motivation for playing games through physiological changes. For instance, Derbali and Frasson (2010) investigated players’ motivation during serious game play based on a theoretical model of motivation (John Keller’s ARCS model) and EEG measures. Their results showed that power spectral analysis of EEG wave patterns was correlated with the increase of motivation during different parts of serious game play (Derbali and Frasson 2010). In addition, Clark et al. (2011) examined how similar or different the learning and affective experiences of students playing the game were in two countries (i.e., Taiwan and the United States). Thus, game-based learning appears to evoke evident affective states, which can be measured by sensing technology.

Cognitive load measurement in learning via physiological signals

In online game-based learning, learners usually need high cognitive capacity to deal with the swift changes of the game, owing to its pace. This may introduce a cognitive overload problem, leading to a fairly unsatisfactory learning experience (Huang 2011) and, consequently, compromising learning effectiveness. Thus, how to measure cognitive load in an online game-based learning environment (GBLE) becomes an essential issue. Currently available methods for assessing cognitive load can be classified along two dimensions: objectivity (subjective or objective) and causal relation (direct or indirect). Objective measures are of three types: physiological, behavioral, and learning outcome (Brunken et al. 2013). Physiological measures include pupil dilation and heart rate. For example, remote eye-tracking equipment has been used to estimate drivers’ cognitive load in a driving simulator (Palinko et al. 2010). In contrast to traditional interviews, an eye-tracking system provides objective evidence of cognitive load, and it has been used in psychology for decades (Yang et al. 2013). The eye-tracking method is appropriate for measuring online cognitive processes because it unveils the temporal change of visual attention, which researchers use to interpret how learners process information during learning (Yang et al. 2013). Meanwhile, eye fixation locations can reflect attention distributions based on the eye-mind assumption (Just and Carpenter 1980).

The physiological input signals selected

This study measured physiological input signals using electroencephalography (EEG) and electrocardiography (ECG). EEG records electrical activity along the scalp and measures voltage fluctuations resulting from ionic current flow within the neurons of the brain (Niedermeyer & da Silva 2004). An ECG sensor is used to measure the rate and regularity of heartbeats generated by the heart’s electrical conduction system. According to the related studies, several techniques need to be combined to estimate the state of attention and emotion. Eye movements provide information about the location of attention and the nature, sequence, and timing of cognitive operations (Lin et al. 2008). Furthermore, with the emergence of EEG technology, a learner’s brainwave pattern characteristics can be measured nonintrusively and transformed into an emotional state with respect to a conventional self-report questionnaire (Rashid et al. 2011). EEG technology has further demonstrated its capability to measure the arousal state of the brain (Zhang and Lee 2012), alertness, cognition, and memory (Berka et al. 2004, 2007). Heart-rate variability from ECG has gained widespread acceptance as a sensitive indicator of mental workload (Lin et al. 2008); in addition, positive emotions (PE) may change the high-frequency components of heart-rate variability (von Borell et al. 2007). For example, emWave, an emotion recognition technology, is a heartbeat (stress) detector that measures changes in emotional state. emWave has been used to assess the effects of different multimedia materials on learning emotion and performance. Chen and Wang’s (2011) results showed that the pretest score and negative emotion can predict the learning performance of learners who used video-based multimedia material for learning. Furthermore, Chen and Wang (2011) found that females are more easily affected by different multimedia materials.
Table 1 summarizes some research efforts that use physiological signals for possible emotion detection.

Table 1 Multiphysiological feature system review

Most of the studies in Table 1 only describe implementation methods or algorithms for measuring physiological signals in order to develop effective emotion recognition; they do not focus on issues related to learning. Therefore, we do not summarize their findings in detail here. This research gap motivates us to examine the relationship between these physiological signals and learning outcomes.

According to the table, physiological signals of eye movement, EEG, and ECG have recently become research trends. However, a system that combines various physiological signals to recognize the affective state has not been developed yet.

Method

Research hypotheses

This study used two learning methods (DGBL and conventional e-learning) to teach one physics topic, projectile motion. The relationships among the two methods and learners’ attention, emotions, strategies, and learning outcomes were examined. The first method enabled learners to study the topic by using the SURGE physics game (Clark et al. 2011), while the second allowed learners to study by using text descriptions, coordinates, and formulas. To make a fair comparison between the two learning methods, the same learning content and learning objectives were prepared for the experiment; that is, the same learning materials were presented in different ways. Figure 1 shows the research framework of this study, and the research variables are discussed in the following section.

Fig. 1 Research framework

Research variables

Table 2 shows the input and output research variables of this study. Learner attention was measured with the NeuroSky system, a device resembling a pair of headphones that detects the electrical activity of neurons. Using NeuroSky’s proprietary Attention & Meditation eSense algorithms, the device records an attention score every second. The attention score ranges from 1 to 100 (1 is the lowest attention level, and 100 is the highest).

Table 2 Input and output variables in this study

Learner emotion was measured with the emWave system, which uses pulse signals to produce a coherence score every 5 s. The coherence score takes the value 0, 1, or 2 (0 is negative emotion, 1 is peaceful, and 2 is PE). As PE increases, the affective experience increases as well (Chen and Wang 2011). The learning outcome was assessed with pretest and posttest results.
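As a minimal sketch of how the two sensor streams described above could be reduced to one value per learner, the following Python snippet averages per-second attention scores and computes the percentage of 5-s coherence samples scored as positive emotion. The function names, the use of a simple mean for attention, and the sample values are illustrative assumptions; they are not taken from the NeuroSky or emWave software, and the aggregation actually used in this study may differ.

```python
POSITIVE = 2  # coherence score for positive emotion (0 = negative, 1 = peaceful)


def mean_attention(scores):
    """Average of per-second attention scores (each expected in 1..100)."""
    if not scores:
        raise ValueError("no attention samples")
    if any(not 1 <= s <= 100 for s in scores):
        raise ValueError("attention scores must lie in 1..100")
    return sum(scores) / len(scores)


def positive_emotion_percentage(coherence):
    """Percentage of 5-s coherence samples scored as positive emotion."""
    if not coherence:
        raise ValueError("no coherence samples")
    if any(c not in (0, 1, 2) for c in coherence):
        raise ValueError("coherence scores must be 0, 1, or 2")
    return 100.0 * sum(c == POSITIVE for c in coherence) / len(coherence)


# Hypothetical samples, not data from the experiment.
attention = [40, 55, 60, 65]
coherence = [0, 1, 2, 2, 1, 2, 0, 2]
print(mean_attention(attention))               # 55.0
print(positive_emotion_percentage(coherence))  # 50.0
```

A 10-min learning phase would yield roughly 600 attention samples and 120 coherence samples per learner, so a per-learner summary of this kind is needed before the signals can be related to a single posttest score.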

Eye-movement measures represent cognitive activities related to reading, comprehension, and shifts of attention. To summarize the eye-movement patterns in each learning environment, five eye-movement measures were used (Yang et al. 2013): the total fixation duration (TFD), the number of fixations (NF), the average fixation duration (AFD), the percentage of viewing time (PVT), and the frequency of saccade path (FSP). Meanwhile, the following four eye-movement measures were used (Yang et al. 2013) to analyze the attention distributions on different media components (look-zones) of the learning material: the percentage of time spent in zone (PTSZ), the fixation count (FC), the percentage of total fixations (PTF), and the percentage of time fixated related to total fixation duration (PTFRTFD). Table 3 provides the list of eye-movement measures and their definitions.
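To make some of these measures concrete, the sketch below derives TFD, NF, and AFD for one learner, and FC and PTF for each look-zone, from a list of fixations. It assumes the conventional readings of the measure names (AFD as TFD divided by NF, PTF as a zone's share of all fixations); the definitions in Table 3 remain authoritative. The `Fixation` record, the zone labels, and the durations are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Fixation:
    zone: str           # look-zone label (hypothetical names below)
    duration_ms: float  # fixation duration in milliseconds


def summary_measures(fixations):
    """Return TFD, NF, and AFD for one learner."""
    nf = len(fixations)
    tfd = sum(f.duration_ms for f in fixations)
    afd = tfd / nf if nf else 0.0
    return {"TFD": tfd, "NF": nf, "AFD": afd}


def zone_measures(fixations):
    """Return the fixation count (FC) and percentage of total fixations (PTF) per zone."""
    counts = Counter(f.zone for f in fixations)
    total = len(fixations)
    return {z: {"FC": c, "PTF": 100.0 * c / total} for z, c in counts.items()}


# Hypothetical fixations, not data from the experiment.
fixations = [
    Fixation("text", 200.0),
    Fixation("text", 300.0),
    Fixation("formula", 250.0),
    Fixation("text", 250.0),
]
print(summary_measures(fixations))
print(zone_measures(fixations))
```

PVT, FSP, PTSZ, and PTFRTFD are omitted because they depend on viewing time and saccade data whose exact definitions are given only in Table 3.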

Table 3 Eye-movement measures and their definitions (Yang et al. 2013)

Participants

Participants were 32 university students (18 females and 14 males) aged 19 to 26. They were randomly assigned to the DGBL and e-learning groups. All participants had studied Newton’s laws of motion in physics in the first year of high school, before they enrolled in university. Therefore, all students were familiar with the concepts; in other words, they already possessed some prior knowledge for solving related problems. All participants had good vision and passed the eye-tracking calibration.

Procedure

Participants were tested individually. On arrival, they first completed the FCI (Force Concept Inventory) pretest, and then physiological sensors were attached to them (Fig. 2). The FCI, whose development began in the late 1980s, is designed as a test of conceptual understanding of Newtonian mechanics; it consists of 30 multiple-choice questions with 5 answer choices each. The scope of the FCI covers the understanding of velocity, acceleration, and force (Hestenes et al. 1992; Manson and Olsen 2010). Next, participants were briefed on the details of the subsequent experiment, the learning process, and the posttest. Before learning, all participants passed the calibration procedures for the eye tracker, NeuroSky, and emWave. Afterward, participants were randomly assigned to the DGBL and e-learning groups to start the learning phase, which lasted approximately 10 min.

Fig. 2 Participant with physiological sensors attached

All participants had to complete a pretest (Force Concept Inventory, FCI) before the learning phase to evaluate their prior knowledge. Learners’ academic achievement (learning outcome) was measured by a posttest (Mechanics Baseline Test, MBT). During the experiment, participants in the e-learning group learned with e-learning material that described Newton’s three laws of motion. The e-learning materials were taken from a university textbook, and each law was displayed on one A4-size PDF page. Participants in the e-learning group had to read all three static pages. The three pages were displayed sequentially to the e-learning group, from page 1 to page 3, for 200 s.

Participants in the DGBL group played the SURGE game, which included two levels, simple and advanced. This study used the SURGE physics game environment designed by Clark et al. (2011). SURGE was built with the Unity 3D game engine (unity3d.com). The SURGE platform is intended to investigate design principles for connecting students’ intuitive “spontaneous concepts” about kinematics and Newtonian mechanics to formalized “instructed concepts” by overlaying the mechanics of popular commercial “marble” video games, such as Mario Galaxy and Switch ball, with formal representations of and connections to formal concepts of Newtonian mechanics. SURGE incorporates the gameplay design of these popular “marble” games in the context of a space-based adventure and belongs to the educational (serious) game type. After learning, all students completed the MBT (the posttest), from which their learning achievement was derived (Hestenes et al. 1992). The procedure was designed based on general recommendations from Psycharis et al. (2014).

Afterward, eye-movement data were recorded while participants were solving the posttest problems. In addition, to understand learners’ problem-solving strategies, participants were asked to speak aloud their ideas for solving each problem. By doing so, this study could collect participants’ justifications, which were used to double-check their answers. Speak-aloud training was conducted with participants before the experiment. The posttest lasted approximately 10 min. Participants’ eye-movement data and responses were recorded. The flow of the experiment is shown in Fig. 3.

Fig. 3 The flow of the experiment

Correlation and multiple regression analysis of physiological signals in DGBL

Total correlation

To find the correlations between posttest scores and learning attention (AT), PE, and cognitive load (TFD, NF, AFD, PVT, FSP), Pearson's correlation coefficient analysis was used. Table 4 shows the Pearson correlation matrix. The results show that posttest scores have no significant correlation with any physiological signal; thus, across all participants, these physiological signals do not reveal how learning state affects academic achievement.
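For readers who wish to reproduce this kind of analysis, the following sketch computes Pearson's r between one physiological measure and the posttest score, together with the t statistic commonly used to test r against zero with n - 2 degrees of freedom. It is a plain-Python illustration under stated assumptions, not the analysis code of this study; the NF and score values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two samples of equal length >= 3")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant sample has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t value for testing r against zero (n - 2 degrees of freedom, |r| < 1)."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Hypothetical values for five learners, not data from the experiment.
nf = [10, 20, 30, 40, 50]   # number of fixations
score = [9, 7, 8, 4, 2]     # posttest score
r = pearson_r(nf, score)
print(round(r, 3), round(t_statistic(r, len(nf)), 3))
```

Repeating the calculation for every pair of variables yields a correlation matrix of the kind reported in the tables of this section; the p value follows from comparing the t statistic with the t distribution.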

Table 4 Pearson's Correlation Matrix

Correlation analysis: e-learning versus DGBL

To find significant correlations in the e-learning group between posttest scores and learning AT, PE, and cognitive load (TFD, NF, AFD, PVT, FSP), Pearson's correlation coefficient analysis was used. The correlation matrix is shown in Table 5. The results show that posttest scores have no significant correlation with the physiological signals. Thus, the physiological signals of the e-learning group did not correlate with their learning achievement.

Table 5 Pearson's Correlation Matrix for the e-learning group

For the DGBL group, Pearson's correlation coefficient analysis was likewise used to examine the correlations between posttest score and learning AT, PE, and cognitive load (TFD, NF, AFD, PVT, FSP); the correlation matrix is shown in Table 6. The results demonstrate that the posttest score has a significant negative correlation with one cognitive load measure (NF), indicating that cognitive load adversely affects academic achievement in the DGBL group. Thus, this study argues that the target DGBL could overload learners’ cognitive capacity and thus lead to a fairly unsatisfactory learning experience (Huang 2011).

Table 6 Pearson's Correlation Matrix for DGBL group

Multiple regression analysis

Based on the experimental results, we propose a formula to analyze the effects of all physiological signals (variables) on the posttest score under DGBL and conventional e-learning. The formula is listed below:

$$ {\text{Score}} = {\text{intercept}} + \beta_{1} {\text{ATT}} + \beta_{2} {\text{PE}} + \beta_{3} {\text{TFD}} + \beta_{4} {\text{NF}} + \beta_{5} {\text{AFD}} + \beta_{6} {\text{PVT}} + \beta_{7} {\text{FSP}} + \varepsilon $$

where Score denotes the learner’s score in the posttest, ATT denotes learning attention, PE denotes the percentage of positive emotion, TFD denotes total fixation duration, NF denotes number of fixations, AFD denotes average fixation duration, PVT denotes percentage of viewing time, FSP denotes frequency of saccade path, \( \beta_{i} , i \in \left\{ {1, \ldots ,7} \right\} \), denotes the coefficient of each variable, and \( \varepsilon \) denotes the error term of the regression formula.

This study adopted multiple regression to estimate the parameters of the above formula in three models. Model 1 was calculated from all participants’ learning and physiological data. Models 2 and 3 were calculated from the learning and physiological data of the DGBL and e-learning groups, respectively. The regression models are summarized in Table 7. Only the DGBL model (Model 2) was significant (adjusted \( R^{2} = 0.763 \), F = 9.030, p = 0.002). Besides the intercept, the percentage of the learner’s positive emotion and two cognitive load variables (NF and AFD) significantly influenced the learning outcome (posttest score).
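The estimation step can be illustrated with a small ordinary least-squares sketch that solves the normal equations and reports the adjusted R-squared, computed as 1 - (1 - R^2)(n - 1)/(n - p - 1) for n observations and p predictors. This is an illustration only: the statistical package used in this study is not stated, the helper names `solve` and `fit_ols` are hypothetical, and the toy data use two predictors rather than the seven in the formula (which would require more than eight observations per model).

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least-squares fit; returns (coefficients, adjusted R^2).

    coefficients[0] is the intercept; the rest follow the predictor order.
    """
    n, p = len(rows), len(rows[0])
    if n <= p + 1:
        raise ValueError("need more observations than parameters")
    X = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = p + 1
    xtx = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)] for a in range(k)]
    xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(k)]
    beta = solve(xtx, xty)
    pred = [sum(bj * xj for bj, xj in zip(beta, row)) for row in X]
    ybar = sum(y) / n
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - ybar) ** 2 for yi in y)
    if ss_tot == 0:
        raise ValueError("outcome is constant")
    r2 = 1 - ss_res / ss_tot
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
    return beta, adj_r2


# Noise-free toy data generated from y = 1 + 2*x1 - 3*x2 (hypothetical, not study data).
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
y = [1 + 2 * x1 - 3 * x2 for x1, x2 in rows]
beta, adj_r2 = fit_ols(rows, y)
print([round(b, 6) for b in beta], round(adj_r2, 6))  # close to [1, 2, -3] and 1
```

Fitting the same function separately to all participants, the DGBL group, and the e-learning group would correspond to the three-model design described above.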

Table 7 Summary of regression model among DGBL and static e-learning

The results revealed that the DGBL outcome was negatively influenced by learning emotion but positively by cognitive load. That is, the higher the DGBL outcome, the higher the cognitive load, but the lower the positive emotion. A high-scoring DGBL learner has a higher cognitive load (i.e., must pay more attention to obtain a higher score) and lower positive emotion. In addition, physiological signals may not have a significant influence on the e-learning outcome. This study assumes that the reason may be the higher cognitive load while playing the SURGE game. However, if players feel the game is difficult to finish, their level of positive emotion may decrease.

The multiple regression method has been used to reveal a significant model between motivational processing (attention, relevance, and confidence) and outcome processing based on the ARCS-based Instructional Materials Motivational Survey (Huang et al. 2010). Goleman (1995) argued that emotions are directly related to and affect learning performance. Chen and Wang (2011) found that a learner’s negative emotions affected learning performance while studying with video-based multimedia material. Therefore, negative emotion (stress) may have been a necessary factor in the learning process, but giving learners too much stress may have resulted in unchanged learning performance. Learners’ cognitive capacities were in high demand in the DGBL environment. Since researchers on cognitive load have concluded that an overloaded cognitive capacity can de-motivate learners, Huang (2011) argues that the target online GBLE might overload learners’ cognitive capacity, thus leading to a fairly unsatisfactory learning experience.

Discussion, conclusion, and educational implications

This study used correlation analysis and multiple regression to examine the relationships among learning attention, learning emotion, cognitive load, and learning achievement. Our results showed that the relationship between physiological signals and the DGBL learning outcome is significant. The learning outcome in DGBL was positively influenced by cognitive load and negatively influenced by emotion, whereas attention did not significantly correlate with learning outcome. Our findings support the argument that emotion is directly related to and affects learning performance (Goleman 1995). Learners’ negative emotion has been shown to affect learning performance while studying with video-based multimedia material. Therefore, negative emotion (stress) may be a necessary factor in the learning process, but giving learners too much stress may result in unchanged learning performance (Chen and Wang 2011). Regarding cognitive load, learners’ cognitive capacities were in high demand in online DGBL (Huang 2011).

In addition, the results showed that academic achievement was highly correlated only with cognitive load (eye-tracker signals) for DGBL learners: the posttest score was negatively correlated with NF. Furthermore, based on our proposed formula for estimating academic achievement, DGBL learners’ academic achievement was significantly affected by positive emotion and cognitive load (NF and AFD). In prior DGBL research, Huang (2011) found that learners’ cognitive capacities were in high demand in online game-based learning. In line with the related literature, which holds that cognitive overload could compromise learning motivation, and based on our findings, we argue that DGBL learners have a higher cognitive load than e-learning learners.

DGBL is considered an effective educational tool for learning (Kebritchi and Hirumi 2008), especially for enhancing learning experiences (Connolly et al. 2007) and motivation (Papastergiou 2009). Our results showed that the outcome of DGBL is significantly correlated with the learner’s learning emotion (e.g., positive emotion) but not with the learner’s attention (brain wave attention), which is partially in line with the arguments made in the related literature. Games that encompass educational objectives and subject matter are believed to hold the potential to render learning more learner-centered, easier, more enjoyable, and more interesting. Although games are believed to be motivational and educationally effective, the empirical evidence to support this assumption is still limited and even controversial (Marina 2009). Hence, this study pioneered a formula that includes affective computing variables (brain wave, heartbeat, and eye tracking) to estimate the effect of physiological signals on the DGBL outcome, using data collected with scientific sensors: a brain wave sensor, a heartbeat sensor, and eye-tracking equipment. Based on multiple regression and correlation analyses, we found that only positive emotion and cognitive load (eye information) are critical factors influencing the learning outcome.

We suggest that future studies adopt more accurate physiological sensors, such as breathing sensors, skin-conductance sensors, and face-recognition equipment, to collect more related data in DGBL. Such an approach will place the relationship between affective learning and academic achievement in a broader perspective. Since researchers on cognitive load have concluded that an overloaded cognitive capacity can de-motivate learners, Huang (2011) argues that the target online GBLE might overload learners’ cognitive capacity, thus leading to a fairly unsatisfactory learning experience. Therefore, we suggest that future game designers carefully design their games to maintain a balance between high learning motivation and low cognitive load. This will help avoid cognitive overload and positively influence the learning outcome.