
1 Introduction

Academic research on video games has been conducted for over two decades. Early research focused on the negative effects of digital game play, particularly the potential harm related to aggression, addiction, social isolation and depression. In recent years, interest in the positive impact of game play has been rising. Figure 1 shows that STEM was and still is the most commonly explored area in serious game and Game-Based Learning (GBL) research, and suggests, surprisingly, that research interests have stayed almost the same over the past decade.

Fig. 1. Search results for the terms “serious game” and “game-based learning” in the Scopus database. The left image shows the results from 2006 to 2016; the right image shows the results for 2016 only.

According to the uses and gratifications theory developed by Katz, Gurevitch and Haas, mass media are supposed to satisfy 35 needs, which can be grouped into five categories: cognitive needs, affective needs, personal integrative needs, social integrative needs, and tension release needs [1]. Games, as one of the most popular and promising mass media, should have the potential to satisfy these five needs. Several reviews have examined how video or serious games meet cognitive needs and affective needs (mainly motivation and engagement) [2,3,4,5]. However, there is no comprehensive review of the impact of games on mental health and moral development that focuses on young people’s personal and social integrative needs, tension release needs and affective needs. To provide evidence that can inspire more researchers to work on the largely unexplored mental health benefits and moral inculcating ability of gaming, the present research summarizes empirical evidence from the last decade of research on games for serious purposes in positive youth development.

2 Search Strategies for Literature and Games

Taking as a reference the game terms derived from a previous search carried out on the evaluation of computer games, the following words and phrases are used in this study: (“computer game” OR “video game” OR “digital game” OR “serious game” OR “games-based learning” OR “online game”). Terms for mental health or moral development are derived from terms used for affection and morality as well as specific impacts and outcomes such as confidence, self-efficacy and empathy: (“emotion” OR “affection” OR “mental health” OR “moral” OR “moral belief” OR “social skills” OR “empathy” OR “altruism” OR “humanity” OR “confidence” OR “self-efficacy” OR “self-esteem” OR “ethical”).
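The two term groups above can be assembled into a single Boolean search string. The following is a minimal sketch, not the authors' actual procedure: the term lists are copied from the text, while the helper names `or_group` and `build_query` and the use of AND to join the two groups are assumptions, since the text does not state how the groups were combined or what syntax each database expects.

```python
# Term lists copied from the search strategy described above.
GAME_TERMS = [
    "computer game", "video game", "digital game",
    "serious game", "games-based learning", "online game",
]
OUTCOME_TERMS = [
    "emotion", "affection", "mental health", "moral", "moral belief",
    "social skills", "empathy", "altruism", "humanity", "confidence",
    "self-efficacy", "self-esteem", "ethical",
]


def or_group(terms):
    """Quote each term, join the terms with OR, and wrap them in parentheses."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"


def build_query(game_terms, outcome_terms):
    """Join the two OR-groups with AND (the joining operator is an assumption)."""
    return f"{or_group(game_terms)} AND {or_group(outcome_terms)}"


if __name__ == "__main__":
    print(build_query(GAME_TERMS, OUTCOME_TERMS))
```

Keeping the terms in lists makes it easy to rerun the same search on each database or to adjust the operator syntax for a database that differs.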

In this study, we focus on large and accessible databases that are relevant to the subjects of education, information technology and social science: Scopus, Science Direct, IEEE/CSDL and PLOS.

Several detailed criteria were defined to select appropriate papers for inclusion in the review: (a) the paper should be published between January 2006 and December 2016; (b) the paper should include empirical evidence relating to the impacts and outcomes of playing games; (c) the participants’ ages should be between 7 and 30, which meets the definition of young people; (d) the paper should contain a positive conclusion related to mental health or moral belief; (e) the paper should have an abstract.

The criteria for exclusion were as follows: (a) research that only covers traditional games such as board games; (b) research that only targets unhealthy people and only includes participants with mental health disorders; (c) research that only focuses on the pleasure of engagement and motivation; (d) research that only emphasizes knowledge acquisition, content understanding, or perceptual, cognitive and motor skills.
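The inclusion and exclusion criteria can be read as a simple screening filter. The sketch below is illustrative only: the `Paper` record and all of its field names are hypothetical, and criterion (c) is interpreted as requiring the whole participant age range to fall within 7 to 30, which is one possible reading of the text.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Paper:
    """Hypothetical screening record; every field name is illustrative."""
    published: date
    has_empirical_evidence: bool
    min_age: int
    max_age: int
    positive_conclusion: bool
    has_abstract: bool
    only_traditional_games: bool = False
    only_clinical_participants: bool = False
    only_engagement_or_motivation: bool = False
    only_knowledge_or_skills: bool = False


def meets_inclusion(p: Paper) -> bool:
    """Inclusion criteria (a)-(e); all of them must hold."""
    return (
        date(2006, 1, 1) <= p.published <= date(2016, 12, 31)  # (a)
        and p.has_empirical_evidence                             # (b)
        and p.min_age >= 7 and p.max_age <= 30                   # (c), assumed reading
        and p.positive_conclusion                                # (d)
        and p.has_abstract                                       # (e)
    )


def meets_exclusion(p: Paper) -> bool:
    """Exclusion criteria (a)-(d); any single match excludes the paper."""
    return (
        p.only_traditional_games
        or p.only_clinical_participants
        or p.only_engagement_or_motivation
        or p.only_knowledge_or_skills
    )


def is_selected(p: Paper) -> bool:
    """A paper is kept if it passes inclusion and triggers no exclusion."""
    return meets_inclusion(p) and not meets_exclusion(p)
```

For example, `is_selected(Paper(date(2012, 5, 1), True, 10, 14, True, True))` keeps a 2012 empirical study of 10- to 14-year-olds, while the same record with `only_traditional_games=True` is dropped.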

We also searched for serious games through the following four sources: games included in the selected papers, Google, YouTube and Steam. The keywords and phrases used in the search engines are (“educational game” OR “serious game” OR “moral education game” OR “humanitarian game” OR “games for students”). For the games included in the papers, the author installed and played the game if the game or its installer could be found online. If a game was not available on the Internet, the author watched its gameplay on YouTube instead.

3 Data Analysis and Result

A large number of papers (3820) were identified with the search terms, indicating a rapidly growing interest in issues relevant to mental health. However, only a few papers (26) met the inclusion criteria, suggesting that there is still great potential in video game research related to young people’s mental health and moral development.

3.1 Research Methods Used in Previous Research

Four main research methods are adopted in this academic domain: Randomised Controlled Trials (RCTs), quasi-experiments, correlation studies and qualitative studies. Because several papers contain more than one experiment and some papers use a mixed method (e.g. RCTs and qualitative study), the total number of experiments is 38. Figure 2 shows the number of experiments for each of the four methods and the proportion of Serious-Game-Based (SGB) and Video-Game-Based (VGB) studies.

Fig. 2. Number of experiments of different types included in the relevant papers

Connolly called on researchers to conduct more RCT-based studies and high-quality qualitative studies [6]. In the years since, there has been an apparent rise in RCTs. However, it is hard to define a non-game-based control for these RCTs, because a game can offer additional experiences (e.g. leisure, audio-visual stimuli) that are hard to simulate separately and offer to participants. In our findings, researchers have established three strategies to address this issue:

  • Introduce a “neutral game” as the control condition

  • Embed the “control condition” within the game design itself

  • Offer similar core information and experience through other media as the control condition

The first strategy is widely used and accepted in VGB research, especially in studies of prosocial or violent topics. Many studies use Tetris as the “neutral game”. Gentile used a pilot test in which young people themselves rated the selected games on their prosocial and antisocial aspects, and he used more modern games (i.e. Pure Pinball or Super Monkey Ball Deluxe), validated by the pretest, as the games for the control condition [7]. The second strategy is well suited to keeping all other influences equal across the control and treatment conditions. One study splits participants (from western developed countries) into two groups (journalists and refugees) in a serious game named Haiti Earthquake, to experience different character perspectives and explore the impact of character identification on empathy [8]. Some studies use different game modes (i.e. competitive mode and cooperative mode) as the different conditions. The third strategy is also commonly used and normally yields more reliable conclusions. In research on sympathy for homeless people, reading a story based on a homeless character, which offers information similar to that of the Homeless gameplay, is used as one of the control approaches [9]. A more creative control condition asks participants to conduct internet research and create a PowerPoint presentation about life in a different country. In this case, by self-learning with the internet, participants receive not only similar information to the treatment group but also a similar experience, owing to this active learning process [10].

There is currently no correlation study of serious games. Some correlation studies may emerge in the future to provide a better overview as the acceptance and quality of serious games improve.

3.2 Participants Involved in Previous Research

Participants are coded according to education level. A large sample of participants (13694) is involved in the selected papers, and Fig. 3 shows a wide range of young people covering primary, secondary, tertiary and higher education. Participants are also spread across many different countries. For example, in the study of international evidence on the effect of prosocial video games on prosocial behavior, participants were drawn from the United States, Japan and Singapore [7]. This quantity and variety of participants contributes to the reliability of the conclusions of these papers. Academia appears to place more emphasis on primary and middle school students, who make up more than 60% of the participants. However, the number of high school participants, fewer than 500, is surprisingly low. This dearth of high school students suggests that more academic attention should be paid to these adolescents.

Fig. 3. Number of involved participants at different education phases

3.3 Genre of Games for Serious Purposes

The majority of classifications focus on commercial video games [11], but they are not useful for the analysis of games for serious purposes. Johnson and Hall’s JDCS model is a good basis for an alternative taxonomy of game genre. The model operates with three main dimensions: job demands, job decision latitude and job social support. In a video game context, the three dimensions correspond to purpose, actions and game mode. Purpose indicates the objective of a game or a segment of a game. For example, the objective of the video game Lemmings is to guide as many Lemmings as possible from the entrance to the exit. Actions are the steps players need to take in order to achieve the task purpose. For example, the action in Lemmings is to assign different skills to different Lemmings to clear obstacles or create a safe passage through the landscape. Game mode refers to whether it is a single- or multi-player game, in other words, the type of connections between players and how strong those connections are. Game context, which specifies the background story or storyline of a game, is another important dimension of serious games.

In conclusion, following the above analysis of the four dimensions of a game, there are four approaches to reaching a prosocial educational purpose: purpose-driven, action-driven, mode-driven and context-driven. Table 1 shows the 20 games we found with our search strategy. A game without clear objectives is excluded from the purpose-driven approach, such as Real Lives, Nintendogs, Killbox, Haiti Earthquake, The Sims, Gone Home, etc. The aim of these games is to go through an experience rather than to accomplish certain goals. It should be noted that a game without an explicit purpose is more likely to be boring.

Table 1. Selected games classified by the four driven approaches

In a well-designed action-driven game, players are supposed to take a series of prosocial and purposive actions in order to complete the task. However, some game designs are so simplified that players can only take limited and straightforward actions. For example, in Homeless: It’s No Game, the only action that players can take is to move the virtual character to collect food to stay alive on the street. These kinds of games are excluded from the action-driven approach, because such actions cannot prompt players to think about and reflect on their behaviours or the game content.

Only 3 games adopt the mode-driven approach. One is a massively multiplayer online (MMO) game that requires teamwork to gain as much mass as possible by swallowing smaller cells without being swallowed by bigger ones. Don’t Starve Together also requires players to share collected resources and work together to get through the dark, cold night.

A context-driven approach has three sub-types. The first type uses stories as background information, with the game play not affecting the story at all, such as 3rd World Farmer, Killbox and Stop Disasters!. The second type embeds the story into the game so that players can understand and alter the storytelling during game play, e.g. Haiti Earthquake, Zoo U, Darfur is Dying, Real Lives and FearNot!. The third type uses narrative as a motivation for players to play, such as Gone Home, Inside and The Sims.

In summary, many games that take only one approach could be better designed. In addition, a pure entertainment game, League of Legends, has also experimented with curbing players’ negative behaviour by introducing the “Honor initiative” and the “Tribunal system”, which encourage players to commend others for excellence in teamwork and mete out warnings and bans for negative player behaviour. As a result, negative chat decreased by 32.7% and positive chat increased by 34.5% [12]. It is encouraging that big companies like Riot Games are now paying attention to players’ mental health and moral development and instituting strategies to support them, and this further supports the idea that players’ attitudes, behaviour and psychological state can be effectively influenced by well-designed games.

3.4 Assessment in and Out of Game

Two types of assessment of serious games are discussed in this section: (1) in-game assessment that contains all the assessment embedded in the virtual game world, and (2) assessment out of the game, which includes assessments taking place in the real world.

With regard to real-world assessment, there are 6 common approaches: (1) surveys, (2) pre- and post-questionnaires, which are often validated with a decent Cronbach’s α value, (3) pre- and post-questionnaires with a follow-up (e.g. several weeks or months after the experience) to assess long-term effects, (4) situational judgment tests, referring to contextual tests and behavioural tests, e.g. a word fragment completion task [13], a sequential modified prisoner’s dilemma game (PDG), and helping tasks such as picking up intentionally spilled pens [14] or returning a lost letter [15], (5) interviews or participants’ self-reports, and (6) mixed approaches.
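Since the questionnaires are described as validated with Cronbach’s α, a short computation may help readers check the internal consistency of their own instruments. The sketch below uses the standard formula α = k/(k−1) · (1 − Σ item variances / variance of total scores), with sample variances from the Python standard library. The function name `cronbach_alpha` and the response data are hypothetical and are not taken from any reviewed study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items score matrix.

    scores: list of rows, one row per respondent, one column per item.
    """
    k = len(scores[0])  # number of questionnaire items
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (column) across respondents.
    item_vars = [variance(col) for col in zip(*scores)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 4 respondents answering 3 Likert-style items.
responses = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 2],
]

if __name__ == "__main__":
    print(round(cronbach_alpha(responses), 3))
```

Items that move together across respondents push α toward 1, while unrelated items push it toward 0.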

In the identified papers, most studies deployed the second approach of pre- and post-questionnaires, which is less convincing. Five SGB and five VGB studies used the third approach. Seven VGB studies and only one SGB study further adopted the fourth approach.

Another interesting finding is the “real-time” assessment used by three studies, which collect Heart Rate Variability (HRV) data [16, 17] and electroencephalogram (EEG) data [18] during the participants’ gameplay for analysis. Another in-game assessment strategy is a bi-feedback game design mechanism: according to the participant’s decision or choice, the game responds by leading to a different situation (e.g. a different narrative branch or a different character personality), and this response should in turn influence the participant. Within the relevant papers, three games use the storyline branch design [8, 19, 20], and one game uses the character personality design [21].

3.5 Categorization of the Game Impacts

Impacts of playing games are analysed in terms of the following two aspects.

Area of impact.

The aim of this review is to find empirical evidence that games can be an effective vehicle for inculcating moral values and improving the mental health of young people. According to the selected papers, the most frequent research emphases are prosocial emotion and behavior (11) and empathy (6), followed by social morality (3), emotion regulation (3), bullying prevention (2) and self-esteem (1).

Impact outcome effectiveness.

Effectiveness is an important aspect of the empirical evidence. Three degrees of effectiveness are proposed in this paper: affective outcome, behavioural outcome and psychological outcome. When experiments use only questionnaires or surveys, the conclusion can only be viewed as an affective (attitude) outcome. When properly designed experiments include situational judgment tests, such as high-cost contextual or behavioural tests, the findings can be regarded as a behavioural outcome. When long-term well-designed experiments, or short-term experiments with an additional follow-up questionnaire, are used, the findings can indicate a lasting psychological outcome. As expected, the most frequent outcome is the affective outcome (14), the easiest to prove, which exceeds the sum of behavioural (6) and psychological (6) outcomes.
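The three degrees of effectiveness amount to a small decision rule, and the reported counts can be cross-checked against the 26 included papers. The sketch below is one possible reading: the counts are those reported in this section, but the function name `outcome_level`, its two boolean inputs, and the choice to let the long-term condition take precedence over the behavioural one are assumptions, because the text does not say which level applies when a study satisfies both.

```python
# Counts as reported in this section.
AREA_COUNTS = {
    "prosocial_emotion_and_behavior": 11,
    "empathy": 6,
    "social_moral": 3,
    "emotion_regulation": 3,
    "bully_prevention": 2,
    "self_esteem": 1,
}
OUTCOME_COUNTS = {"affective": 14, "behavioural": 6, "psychological": 6}
TOTAL_PAPERS = 26  # papers that met the inclusion criteria


def outcome_level(uses_situational_judgment_test, long_term_or_follow_up):
    """Map a study design to one of the three degrees of effectiveness.

    Checking the long-term condition first is an assumption; the text does
    not state a precedence between the two conditions.
    """
    if long_term_or_follow_up:
        return "psychological"
    if uses_situational_judgment_test:
        return "behavioural"
    return "affective"  # questionnaire or survey only


def tallies_are_consistent():
    """Check that both breakdowns add up to the number of included papers."""
    return (
        sum(AREA_COUNTS.values()) == TOTAL_PAPERS
        and sum(OUTCOME_COUNTS.values()) == TOTAL_PAPERS
    )


if __name__ == "__main__":
    print(tallies_are_consistent())
    print(outcome_level(uses_situational_judgment_test=True,
                        long_term_or_follow_up=False))
```

Both breakdowns sum to 26, and the affective count (14) is larger than the behavioural and psychological counts combined (12), in line with the statement above.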

4 Discussion

4.1 Discussion of Assessment Design

Unlike knowledge acquisition, moral beliefs and affection are very hard to assess directly, and people tend to hide their true thoughts when they fill in a questionnaire. A simple questionnaire alone is therefore not persuasive enough. As the data in Sect. 3 show, VGB research generally has more successful assessment designs than SGB research, especially with respect to situational judgment tests, which consist of contextual tests and behavioural tests. It should be noted that these tests range from low-cost to high-cost.

Picking up spilled pens [22,23,24] is one of the lowest-cost tests; returning a lost envelope [25] and volunteering to help with someone’s master’s thesis [24] are higher; the highest-cost design is attempting to stop aggressive intimidation by a male confederate, also found in Greitemeyer’s study of prosocial behavior [24]. Greitemeyer made a further contribution to the “spilled pens scenario” by having an independent experimenter, who was not aware of the participant’s experimental condition, spill the pencils; this modification avoids the potential influence of an experimenter who knows the condition [24].

Among contextual tests, completing an ambiguous story by listing possible actions of the main character [26], reacting to stories about real or fictional persons [27], and completing a list of word fragments within a limited time [26] are lower cost, while behavior in a modified PDG and assigning partners tangram puzzles of different difficulty to prevent them from getting a reward are higher cost.

In short, high-cost situational judgment tests are highly recommended as one of the assessment approaches for future research in this area. In addition, there is still a paucity of high-quality qualitative designs. The five papers (i.e. [9, 21, 28,29,30]) either use a qualitative approach (e.g. interview, report) as an adjunct to other research methods or are poorly organized, with basic questions, flat analysis and an absence of raw data from participants.

In addition, more emphasis should be put on real-time assessment. Although an overall meaningful interpretation of physiological data has not yet been provided and remains an unsolved problem [31], modern physiological measures are still promising as future measures of assessment. These include facial electromyography (EMG), which measures facial muscle activity by detecting the electrical impulses generated [32]; cardiovascular measures (e.g. interbeat interval, heart rate) [32]; Galvanic Skin Response (GSR), which measures the electrical conductance of the skin; Electroencephalography (EEG), which measures electrical activity along the scalp [33]; and eye tracking, which measures either the point of gaze or the motion of an eye. Researchers should keep track of the latest advances in this field and employ them. Physiological measurement is particularly effective when participants tend to hide their true feelings or reactions. For example, participants may pretend to be fearless when they stand on a clifftop in a virtual environment. In this case, it is difficult to judge by a questionnaire or even by observation alone; however, physiological data such as GSR can help researchers reveal the truth.

In-game real-time assessment offers a seamless experience for players and is more reliable. For example, the game MindLight uses an EEG headset that converts brain waves into the intensity of the avatar’s head light: the more relaxed the players become, the brighter the “mindlight” shines, and the light is the only way that players can see in the dark haunted house [18]. Besides physiological data such as GSR, EEG or EMG, body movement is also a possible input. Another design is to use players’ decisions or actions as input, with the game reciprocating an “assessment” by altering the narrative or a character’s personality. An excellent example is an interactive fantasy mystery adventure game named The Wolf Among Us. Players need to keep making decisions within the game, e.g. beating someone or persuading someone, and each decision can not only lead to a different storyline but also shift the main character’s disposition to be gentler or more violent. The third possible design is to embed the survey, interview or self-report inside the game. This design can offer a seamless experience and elicit more authentic responses from the participants. For example, the game Haiti Earthquake offers two perspectives (i.e. journalist and survivor roles) from which to experience the aftermath of the Haiti earthquake [8]. A better instrument could be to embed the qualitative measure in the game itself, by asking players in the journalist’s perspective to write a daily news report and players in the survivor’s perspective to write a daily diary.

4.2 Discussion of Game Impact

As Sect. 3 revealed, games for serious purposes can have an impact on several areas, including prosocial behavior, empathy and emotion regulation. They can also change players’ attitudes and behavior and influence their psychological state.

Prosocial video games are the most investigated area. Several studies demonstrate that people tend to show more prosocial attitudes and behaviors after playing a prosocial game, even for a short time (8–25 min) [7, 22,23,24, 26, 34, 35]. One interesting finding is that five of these seven studies adopt contextual tests or behavioural tests as the assessment approach, and that after short-term prosocial video game play people are inclined to offer high-cost help to protect a girl from her ex-boyfriend [24]. In addition, the seven studies cover different prosocial games such as Lemmings, Chibi Robo, Wii Sports, NBA Street and a customized game. A study of Lemmings, which is a prosocial purpose-driven and prosocial action-driven game, reveals through the assessment of ambiguous story stems and word fragment completion that game play can reduce antisocial thoughts and encourage prosocial thoughts and positive feelings [26]. A study of Wii Sports and NBA Street, which are cooperative mode-driven games, shows that a cooperative game mode can motivate prosocial behavior [23]. Because of the positive impact on prosocial reciprocity expectations, playing with a helpful teammate can further lead to increased donations to strangers in the sequential prisoner’s dilemma game assessment [34]. These studies provide strong evidence of the positive effect of games with a prosocial task purpose, actions and game mode. However, there is no study of the impact of context-driven prosocial games, for example of the difference between a neutral game with and without a prosocial game context.

Three studies focused on the long-term effect of prosocial game play by using questionnaires and surveys on a large sample of people. One important finding suggests that prosocial behavior tendencies and prosocial game playing are positively connected and reinforce each other, and this study provides stronger evidence for a causal long-term relationship between prosocial game play and prosocial behavior [7]. Another important finding, from a longitudinal study, shows that both the prosocial- and the violent-game effects on prosocial behavior were mediated by changes in empathy, and provides evidence that prosocial-media use can lead to long-term increases in trait empathy and helping [36].

Empathy also attracts great attention from researchers. The six studies of empathy are more diverse, each with a different research focus and based on a different game. One study shows the great impact of game context by introducing the background stories of Superman and The Joker to the participants before they play the video game Mortal Kombat vs. DC Universe. The background stories, which make the participants feel more empathy for Superman and The Joker, unexpectedly serve as an amplifier for already established attitudes and cognition, widening the gap between the “good” attitude and the “evil” attitude [25]. Lemmings is also used to support the positive influence of prosocial games on empathy, and this study demonstrates that a prosocial-task-based game without any background story can also foster empathic concern [27].

Several studies focus on the impact of games on moral education and bullying prevention, but most of them use poorly designed games. The only study based on a high-quality commercial serious game, Zoo U, reported mixed results. After a 10-week intervention trial, children who played Zoo U showed significant improvements in impulse control, social initiation skills and cooperation, but also an increase in social withdrawal and anxiety at the same time. Because social skills, including moral and mental health development, are particularly sophisticated and need long-term intervention, more qualitative study designs should be used and studies should focus on impact over six months or even longer.

It should be specially noted that researchers must carefully choose or design a game to investigate their research question and hypothesis. To show that the game is not poorly designed, the in-game experience, including presence and flow, should be checked. Compared to commercial games, serious games normally offer a much worse game experience and visual quality, and the poor experience could distort participants’ feelings and lead to a different research conclusion. For example, Nintendogs, developed by Nintendo, has vivid images and animation of a virtual 3D dog. A study based on this game concludes that a computer-simulated virtual pet dog might be able to influence children’s development of humane attitudes and empathy [21]. If the study had been based on a pixelated virtual pet dog (like the virtual pet games of the 1990s), the result would probably have been different.

In summary, studies of prosocial games are of relatively high quality, and studies of empathy, which examine the impact of several different games with different assessment approaches, are also good. Because poorly designed games are used, studies of moral issues are of relatively low quality and their results are mixed. Furthermore, with regard to moral and mental health issues, there is great potential for a wider range of areas such as domestic violence and the current refugee problem.

The authors are currently working on a research project on historical empathy in Holocaust education for young people and are developing a serious game for the National Holocaust Centre and Museum. The game features interactive storytelling about the Kindertransport programme that helped ten thousand children escape the Holocaust. The game uses the latest Augmented Reality (AR) equipment, HoloLens. This AR technology can mix virtual 3D characters into real-world environments. Following the findings of this paper, the game will be developed using the latest 3D game engine to provide vivid images and animation so that players can have a good in-game experience. The game will combine purpose-driven, action-driven and context-driven approaches. Specifically, it will set up a certain task purpose with a series of actions that players should take, and will use narrative as a motivation to encourage players to dig into the story, similar to the video game Gone Home. To ensure the reliability of the conclusion, contextual or behavioural tests will be adopted as the assessment approach. In addition, qualitative research methods including individual interviews, observation and personal reports will also be taken into account. This AR-based study can contribute to the study of serious games’ impact on empathy and to the understanding of moral education based on serious games in a museum environment.

5 Conclusion

The current review provides empirical evidence to support the educational effectiveness of games in the moral and mental development of young people. The four-dimensional analysis of games proposed in the current study provides a new approach for classifying and analyzing games for serious purposes.

Studies of prosocial games are of relatively good quality, but previous studies cover only part of the area: prosocial behavior, social morality, bullying prevention, empathy, emotion regulation, etc. There is great potential for a wider range of areas regarding mental health and moral development. To avoid distorting players’ emotions and to collect reliable data from them, it is essential to use or design a game with good in-game presence and flow experience, and it is important to employ well-designed game assessment, both in and out of the game.

Most studies adopt a quantitative study design such as an RCT or quasi-experiment, and not enough studies have been conducted over a long time span. There is a need for more high-quality qualitative studies and more longitudinal studies to provide a deeper understanding of the effectiveness and advantages of game-based education.