1 Introduction

Educational assessment practice is challenging as there are a number of diverse concepts referring to the idea of assessment. Newton (2007) laments that the distinction between formative and summative assessment hindered the development of sound assessment practices on a broader level. Black (1998) defines three main types of assessment: (a) formative assessment to aid learning; (b) summative assessment for review, for transfer, and for certification; and (c) summative assessment for accountability to the public. Pellegrino, Chudowsky, and Glaser (2001) extend these definitions with three main purposes of assessment: (a) assessment to assist learning (formative assessment), (b) assessment of individual student achievement (summative assessment), and (c) assessment to evaluate programs (evaluative assessment). A common thread among the many definitions points to the concept of feedback for a variety of purposes, audiences, and methods of assessment (Ifenthaler, Greiff, & Gibson, 2018).

Digital game-based technologies are nudging the field of education to redefine what is meant by learning, instruction, and assessment. Proponents of game-based learning argue that students should be prepared to meet the demands of the twenty-first century by teaching them to be innovative, creative, and adaptable so that they can deal with the demands of learning in domains that are complex and ill-structured (Federation of American Scientists, 2005; Gee, 2003; Ifenthaler, Eseryel, & Ge, 2012; Prensky, 2001; Shaffer, 2006). On the other hand, opponents of games argue that games are just another technological fad that emphasizes superficial learning. In addition, opponents argue that games cause increased violence, aggression, inactivity, and obesity while decreasing prosocial behaviors (Walsh, 2002).

However, Ifenthaler et al. (2012) argue that the implementation of assessment features into game-based learning environments is only in its early stages because it adds a very time-consuming step to the design process. Also, the impact on learning as well as the reliability and validity of technology-based assessment systems are still being questioned. Three distinguishing features of game-based assessment have been proposed and are widely accepted: (1) game scoring, (2) external assessment, and (3) embedded assessment of game-based learning (Ifenthaler et al., 2012). Only recently, an additional feature has been introduced which enables adaptive gameplay and game environments, broadly defined as learning analytics (Ifenthaler, 2015) and specifically denoted as serious games analytics (Loh, Sheng, & Ifenthaler, 2015). Serious games analytics converts learner-generated information into actionable insights for real-time processing. Metrics for serious games analytics are similar to those of learning analytics including the learners’ individual characteristics (e.g., socio-demographic information, interests, prior knowledge, skills, and competencies) and learner-generated game data (e.g., time spent, obstacles managed, goals or tasks completed, navigation patterns, social interaction, etc.) (Ge & Ifenthaler, 2017; Ifenthaler, 2015; Loh, Sheng, & Ifenthaler, 2015).
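To make the notion of learner-generated game data more concrete, the following minimal sketch shows how a raw event log might be aggregated into metrics such as time spent, tasks completed, and obstacles managed. The log format, the event names, and the function summarize are assumptions introduced purely for illustration; they are not taken from any of the cited systems.

```python
from collections import defaultdict

# Hypothetical event log: (learner_id, timestamp_in_seconds, event_type).
# The event names are illustrative and not drawn from any specific game.
EVENTS = [
    ("a", 0, "session_start"),
    ("a", 40, "task_completed"),
    ("a", 95, "obstacle_managed"),
    ("a", 120, "session_end"),
    ("b", 0, "session_start"),
    ("b", 300, "task_completed"),
    ("b", 310, "task_completed"),
    ("b", 400, "session_end"),
]


def summarize(events):
    """Aggregate raw game events into simple per-learner metrics."""
    first_seen = {}
    last_seen = {}
    counts = defaultdict(lambda: defaultdict(int))
    for learner, timestamp, kind in events:
        # Track the earliest and latest event per learner to estimate time spent.
        first_seen[learner] = min(timestamp, first_seen.get(learner, timestamp))
        last_seen[learner] = max(timestamp, last_seen.get(learner, timestamp))
        counts[learner][kind] += 1
    return {
        learner: {
            "time_spent": last_seen[learner] - first_seen[learner],
            "tasks_completed": counts[learner]["task_completed"],
            "obstacles_managed": counts[learner]["obstacle_managed"],
        }
        for learner in first_seen
    }


if __name__ == "__main__":
    for learner, metrics in summarize(EVENTS).items():
        print(learner, metrics)
```

Such aggregated metrics would only be a starting point; turning them into actionable insights still requires an assessment model that links them to the constructs of interest.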

This chapter seeks to identify why research on game-based assessment is still in its infancy, what advances have been achieved over the past 10 years, and which challenges lie ahead for advancing assessment in game-based learning.

2 Game-Based Assessment and Assessment of Learning in Games: Why?

Games, both digital and nondigital, have become an important aspect of young people’s lives. According to a recent survey conducted in the United States, 72% of youth ages 13–17 play games daily or weekly (Lenhart, 2015). Gaming is also one of the most popular social activities, especially for boys, 55% of whom play games in person or online with friends daily or weekly. As gaming gained popularity in people’s daily lives in the early 2000s, educational researchers began to investigate the potential educational benefits of games for learning and what we can learn from well-designed games about learning and assessment (Gee, 2003).

So what are the affordances of games for learning? First, people learn in action in games (Gee, 2008). That is, people interact with all aspects of the game and take intentional actions within the game. For its part, the game continuously responds to each action, and through this process, the player gradually creates meaning. Clearly, how people are believed to learn within video games contrasts with how people typically learn at school, which often entails memorization of decontextualized and abstract concepts and procedures (Shute, Ventura, Bauer, & Zapata-Rivera, 2009). Second, due to the interactive nature of games, learning by playing can lead to conceptual understanding and problem-solving (Eseryel, Ge, Ifenthaler, & Law, 2011) in addition to domain-specific skills and practices (Bressler & Bodzin, 2016) that go beyond the basic content knowledge more commonly taught in the classroom. Steinkuehler and Duncan (2008) have found players in virtual worlds frequently engaging in social knowledge construction, systems-based reasoning, and other scientific habits of mind. This body of work shows that games in general have a lot of potential for contributing to a deep learning environment. In video games, players engage in active and critical thinking, they take on different identities, and they have opportunities to practice skills and find intrinsic rewards as they work on increasingly difficult challenges on their path to mastery (Eseryel, Law, Ifenthaler, Ge, & Miller, 2014; Gee, 2003).

Numerous studies have reported the benefits of games for learning as a vehicle to support student learning. In a meta-analysis study, Clark, Tanner-Smith, and Killingsworth (2016) reported that compared to nongame conditions, digital games had a moderate to strong effect in terms of overall learning outcomes including cognitive and interpersonal skills. Similarly, a literature review by Boyle et al. (2016) reports that games are beneficial for learning of various outcomes such as knowledge acquisition, affect, behavior change, perception, and cognition. Numerous studies also reported academic domain-specific benefits of games for learning including science and mathematics (Divjak & Tomić, 2011). To answer the question of what people are learning from playing games, researchers have been using a variety of methods including external measures, log data capturing in-game actions, and game-related actions beyond the game context (Ifenthaler et al., 2012; Loh et al., 2015).

3 Game-Based Assessment: Past 10 Years

Several meta-analyses and systematic reviews have been published focusing on game-based learning. For example, Baptista and Oliveira (2019) highlight important variables in their literature search of more than 50 studies focusing on serious games including intention, attitude, enjoyment, and usefulness. A systematic review by Alonso-Fernández, Calvo-Morata, Freire, Martínez-Ortiz, and Fernández-Manjón (2019) focuses on the application of data science techniques on game learning data and suggests specific game learning analytics. Ke (2016) presents a systematic review on the integration of domain-specific learning in game mechanics and game world design. Another systematic review by Ravyse, Seugnet Blignaut, Leendertz, and Woolner (2017) identifies five central themes of serious games: backstory and production, realism, artificial intelligence and adaptivity, interaction, and feedback and debriefing. Notably, none of the abovementioned meta-analyses and systematic reviews have a clear focus on assessment of game-based learning.

Still, a line of research that emerged over the past 10 years addresses the question of how we can use games as an interactive and rich technology-enhanced environment to advance assessment technologies. That is, the primary goal of this line is to advance assessment using games (Ifenthaler et al., 2012). Earlier game-based assessment work has primarily focused on applying the evidence-centered design framework to develop assessment models with specific learning outcomes and skills in mind (Behrens, Mislevy, Dicerbo, & Levy, 2012). For example, Shute et al. (2009) describe an approach called stealth assessment, in which in-game behavioral indicators (e.g., specific actions taken within a quest in Oblivion) are identified from logged data and used to make inferences about the player’s underlying skills (e.g., creative problem-solving) without interrupting the flow of gameplay. Using this approach, one can use existing games to measure latent constructs, even if the game was not explicitly developed for the purpose of learning or assessment, as long as the game provides ample contexts (or situations) that elicit evidence for underlying skills and constructs (Loh et al., 2015). Similarly, using the popular game SimCity, GlassLab developed SimCityEDU to assess students’ systems thinking (Dicerbo et al., 2015). These approaches have primarily used the evidence-centered design framework (Almond, Steinberg, & Mislevy, 2002) to align what people might learn from the game with what they do in games.
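The idea of accumulating evidence about a latent skill from in-game indicators can be illustrated with a deliberately simplified sketch. It assumes a single binary skill (mastered or not) and binary indicators whose likelihoods are specified in advance; it is not the model used by Shute et al. (2009) or in any other cited study, and the function names and probabilities are hypothetical.

```python
def update_mastery(prior, observed, p_if_mastered, p_if_not):
    """Apply Bayes' rule to one binary indicator.

    prior          -- current probability that the skill is mastered
    observed       -- True if the indicator behavior was shown in the game
    p_if_mastered  -- assumed P(indicator shown | skill mastered)
    p_if_not       -- assumed P(indicator shown | skill not mastered)
    """
    like_mastered = p_if_mastered if observed else 1.0 - p_if_mastered
    like_not = p_if_not if observed else 1.0 - p_if_not
    numerator = like_mastered * prior
    return numerator / (numerator + like_not * (1.0 - prior))


def stealth_estimate(prior, evidence):
    """Update the belief sequentially over (observed, p_if_mastered, p_if_not) triples."""
    belief = prior
    for observed, p_if_mastered, p_if_not in evidence:
        belief = update_mastery(belief, observed, p_if_mastered, p_if_not)
    return belief


if __name__ == "__main__":
    # Hypothetical evidence: two indicators shown, one not shown.
    log = [(True, 0.8, 0.2), (True, 0.7, 0.4), (False, 0.8, 0.2)]
    print(round(stealth_estimate(0.5, log), 3))
```

Because the updates are computed from logged actions alone, the learner is never interrupted by an explicit test item, which is the defining property of the stealth approach.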

Eseryel, Ifenthaler, and Ge (2011) provide an integrated framework for assessing complex problem-solving in digital game-based learning in the context of a longitudinal design-based research study. In a longitudinal field study, they examined the impact of the massively multiplayer online game (MMOG) Surviving in Space on students’ complex problem-solving skill acquisition, mathematics achievement, and motivation. Two different methodologies to assess students’ progress of learning in complex problem-solving were applied. The first methodology utilized adapted protocol analysis (Ericsson & Simon, 1980, 1993) to analyze students’ responses to the given problem scenario within the framework of the think-aloud methodology. The second methodology utilized the HIMATT methodology (Eseryel, Ifenthaler, & Ge, 2013; Pirnay-Dummer, Ifenthaler, & Spector, 2010) to analyze students’ annotated causal representations of the phenomena in question. The automated text-based analysis function of HIMATT enables the tracking of the association of concepts directly from texts that contain 350 or more words, hence producing an adaptive assessment and feedback environment for game-based learning. For future game design, the algorithms produce quantitative measures and graphical representations which could be used for instant feedback within the game or for further analysis (Ifenthaler, 2014).

More recently, researchers have introduced learning analytics and data mining techniques to broaden what game-based assessment means (Loh et al., 2015). For example, Rowe et al. (2017) built “detectors,” machine-learned algorithms that use log data from the game to measure implicit understanding of physics, different strategies associated with productivity in the game, and computational thinking. While they did not use formal measurement models (e.g., IRT or Bayes nets), these detectors are implemented in the game engine to make real-time inferences about players. Similarly, Shadowspect, developed at the MIT Playful Journey Lab (Kim & Rosenheck, 2018), is another example of game-based assessment that utilizes new advancements in learning analytics and educational data mining techniques in the process of game design and development for the purpose of assessment.

Hence, the application of serious games analytics opens up opportunities for the assessment of engagement within game-based learning environments (Eseryel et al., 2014). The availability of real-time information about the learners’ actions and behaviors stemming from key decision points or game-specific events provides insights into the extent of the learners’ engagement during gameplay. The analysis of single actions or behaviors and the investigation of more complex series of actions and behaviors can elicit patterns of engagement and therefore provide key insights into learning processes (Ge & Ifenthaler, 2017).
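One simple way to move from single actions to series of actions is to count how often one action directly follows another. The sketch below does this for a hypothetical action sequence; the action labels and the function action_bigrams are illustrative assumptions, and the cited studies may rely on considerably more sophisticated sequence analyses.

```python
from collections import Counter


def action_bigrams(actions):
    """Count consecutive action pairs in one learner's ordered action log."""
    return Counter(zip(actions, actions[1:]))


if __name__ == "__main__":
    # Hypothetical ordered log of one learner's in-game actions.
    log = ["explore", "hint", "explore", "hint", "solve"]
    for pair, count in action_bigrams(log).most_common():
        print(pair, count)
```

Frequent pairs such as repeated switching between exploring and requesting hints could then be examined as candidate indicators of engagement, subject to validation against external measures.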

Ifenthaler and Gibson (2019) report how highly detailed data traces captured by the Challenge platform, with many events per learning activity, bring the potential for measuring indicators of physical, emotional, and cognitive states of the learner when combined with new input devices and approaches. The data innovation of the platform is the ability to capture event-based records of the higher-frequency and higher-dimensional aspects of learning engagement, which is in turn useful for analysis of the effectiveness and impact on the physical, emotional, and cognitive layers of learning caused or influenced by the engagements. This forms a high-resolution analytics base on which research into digital learning and teaching as well as into how to achieve better outcomes in scalable digital learning experiences can be conducted (Gibson & Jackl, 2015).

4 Challenges and Future Work

While interest in game-based assessment peaked in 2009 when the GlassLab was launched to scale up this approach in the broad education system, many promises of game-based learning and assessment have not been fully realized in the actual education system. Based on a reflection on the field’s achievements in the past 10 years and the contributions to the current volume, this section outlines challenges that the field of game-based assessment still faces as well as future work that researchers, game designers, and educators should address to transform how games are used in the education system.

While evidence-centered design (ECD) has been the predominant framework to design assessment in games, it is often unclear how different development processes leverage ECD to conceptualize game design around the competency of interest (Ke, Shute, Clark, & Erlebacher, 2019). For example, how can assessment models be formalized? How can formalized assessment models be translated to game design elements? When in the game design process does this translation occur most effectively? How can competency models be transformed into interesting, engaging game mechanics? How can psychometric qualities be ensured without being too prescriptive?

Many established game-based assessment approaches focus on understanding the continuous progression of learning, thinking, reasoning, argumentation, and complex problem-solving during digital game-based learning. From a design perspective, it seems important that the game mechanisms address the underlying affective, behavioral, and cognitive dispositions, which must be assessed carefully at various stages of the learning process and hence considered while conceptualizing and designing games for learning (Bertling, Jackson, Oranje, & Owen, 2015; Eseryel et al., 2014; Ge & Ifenthaler, 2017).

Advanced data analytics methodologies and technological developments enable researchers, game designers, and educators to easily embed assessment and analysis techniques into game-based learning environments (Loh et al., 2015). Internal assessment and instant analysis including personalized feedback can be implemented in a new generation of educational games. However, it is up to educational research to provide theoretical foundations and empirical evidence on how these methodologies should be designed and implemented. We have just arrived in the age of educational data analytics. Hence, it is up to researchers, technologists, educators, and philosophers to make sense of these powerful technologies and thus to better help learners learn.

With the challenges brought on by game-based assessments including data analytics, the large amount of data now available for teachers is far too complex for conventional database software to store, manage, and process. Accordingly, analytics-driven game-based assessments underscore the need to develop assessment literacy in stakeholders of assessment (Ifenthaler et al., 2018; Stiggins, 1995). Game designers and educators applying data-driven game-based assessments require practical hands-on experience with the fundamental platforms and analysis tools for linked big game-based assessment data. Stakeholders need to be introduced to several data storage methods and to ways of distributing and processing the stored data, to possible ways of handling analytics algorithms on different platforms, and to visualization techniques for game-based assessment analytics (Gibson & Ifenthaler, 2017). Well-prepared stakeholders may demonstrate additional competencies such as understanding large-scale machine learning methods as foundations for human-computer interaction, artificial intelligence, and advanced network analysis (Ifenthaler et al., 2018).

The current research findings also indicate that design research and development are needed in automation and semi-automation (e.g., humans and machines working together) in assessment systems. Automation and semi-automation of assessments to provide feedback, observations, classifications, and scoring are increasingly being used to serve both formative and summative purposes in game-based learning.

Gibson, Ifenthaler, and Orlic (2016) proposed an open assessment resources approach that has the potential to increase trust in and use of open education resources (OER) in game-based learning and assessment by adding clarity about assessment purposes and targets in the open resources world. Open assessment resources (OAR) with generalized formative feedback are aligned with a specific educative purpose expressed by some user of a specific OER toward the utility and expectations for using that OER to achieve an educational outcome. Hence, OAR may be utilized by game designers to include valuable and competence-based assessments in game-based learning.

The application of analytics-driven game-based assessments opens up opportunities for the assessment of engagement and other motivational (or even broader: non-cognitive) constructs within game-based learning environments (Eseryel et al., 2014). The availability of real-time information about the learners’ actions and behaviors stemming from key decision points or game-specific events provides insights into the extent of the learners’ engagement during gameplay. The analysis of single action or behavior and the investigation of more complex series of actions and behaviors can elicit patterns of engagement and therefore provide key insights into ongoing learning processes within game-based learning environments.

To sum up, the complexity of designing adaptive assessment and feedback systems has been discussed widely over the past few years (e.g., Sadler, 2010; Shute, 2008). The current challenge is to make use of data—from learners, teachers, and game learning environments—for assessments. Hence, more research is needed to unveil diverse methods and processes related to how design teams, often including learning scientists, subject-matter experts, and game designers, can seamlessly integrate design thinking and the formalization of assessment models into meaningful assessment for game-based learning environments.