According to the Autism and Developmental Disabilities Monitoring (ADDM) Network and a corresponding report authored by Christensen et al. (2016), approximately 1 in 68 children is identified with an autism spectrum disorder (ASD). The prevalence of autism is growing, and the disorder affects individuals across all racial, ethnic, and socioeconomic groups. In North America alone, studies have reported an average ASD prevalence of approximately 1–2 %.

One of the core characteristics of an ASD is a deficit in social communication and interaction (American Psychiatric Association 2013). Part of this social communication deficit is difficulty “reading” facial expressions, that is, difficulty with emotion recognition (Hobson 1993). In the context of this review, emotion recognition is defined as the ability to discriminate an emotion from observing a facial expression. According to Ekman and Friesen (1976), there are six basic emotions: happiness, sadness, anger, fear, disgust, and surprise. In addition, there are nine complex emotions: excited, tired, unfriendly, kind, sorry, proud, jealous, joking, and ashamed.

Considered to be fundamental in the development of a child, emotion recognition has been closely linked to socio-emotional skills and social competence (Lierheimer and Stichter 2012; Young and Posselt 2012; Uljarevic and Hamilton 2013). Widen and Russell (2003) found that neurotypical children recognized and labeled the six basic facial expressions portraying emotion by the age of 3 years. In contrast, Scambler et al. (2007) found that at the age of 2 years, children with ASD were less responsive to facial expressions exhibiting joy, fear, disgust, and pain when compared to typically developing, age-matched children. Furthermore, Williams (2013) found associations between deficits in emotion recognition and social competence, as measured by the Vineland-II Socialization domain (Sparrow et al. 2005), amongst 42 young participants identified with autism. As a result, the ability of individuals with ASD to recognize and understand emotions expressed by others may be crucial to addressing the population’s deficits in social communication and interaction.

An inability to recognize emotions may relate to the tendency of individuals with ASD to speak at length about a preferred topic without considering the interest of or the impact on the people with whom they are communicating (Hopkins et al. 2011). The interaction between the individual with autism and another person is thus one-sided instead of reciprocal. Deficits in emotion recognition in individuals with autism are also related to a lack of eye contact, decreased curiosity toward faces or abnormal processing of human faces, and difficulties understanding nonverbal behaviors, such as body language and gestures (Hobson 1986; Hopkins et al. 2011).

The ability to discriminate emotional expressions from facial expressions is also related to Theory of Mind (ToM), which is the ability to identify another’s perspective or their “mental states: that is, knowing that other people know, want, feel, or believe things” (Baron-Cohen et al. 1985, p. 38). “As early as the preschool years, theory of mind ability has been associated with the capacity of children to engage in and sustain pretend play with peers. Individuals on the autism spectrum experience delays in theory of mind, and these delays have significant effects on their social development” (Myszak 2010, p. 1). In a study by Baron-Cohen et al. (1985), individuals with autism failed to complete ToM scenarios, which showed a distinct deficit in this population. In another study, Baron-Cohen et al. (1997) developed a ToM assessment and found that individuals with ASD had substantial difficulties with determining emotions from viewing photographs of faces and segments of faces around the eyes when compared to typically developing children of the same age and IQ. LaCava et al. (2007) noted that individuals with ASD experience difficulties with recognizing emotions due to deficits in taking on the perspective of another person.

Furthermore, in a study by Jones and Klin (2013), eye contact was found to be a potential diagnostic feature for ASD. Using eye-tracking technology, the authors found that infants who would later be diagnosed with ASD showed marked declines in eye contact. Decreased orientation toward eyes may be associated with deficits in social engagement in this population. According to Hopkins et al. (2011), individuals with ASD also view faces differently by focusing on specific features, which may impact how they process emotions shown on the face and make social judgments. In another study using eye-tracking technology, Spezio et al. (2007) found that individuals with autism predominantly focused on the mouth region when viewing and identifying facial expressions. In comparison, typically developing participants viewed the eye area more when recognizing emotions. Jones and Klin (2013) came to similar conclusions about infants’ fixations on the mouth area and a diagnosis of ASD. The way individuals with ASD process faces may be due to a preference for systems or systemizing. Systemizing is a desire to examine and construct systems, which allows an individual to better predict and control the movements of the system. The systems may consist of mechanical structures, such as vehicles, number patterns, natural systems, and collections of items (Golan et al. 2010). The theory of systemizing is also connected to ToM in individuals with autism. Based on both theories, individuals with ASD attend to faces less and have difficulties recognizing emotions and empathizing due to their preference for predictable systems and their difficulties with interpreting the complicated facial expressions of others (Golan et al. 2010).

Past research also suggests that individuals with ASD have an affinity for technology. Coupled with continuing advancements in the technological field, this affinity has led investigators to increasingly examine how deficits in the population may be addressed through this modality. Although promising, the application of technology in instructing individuals with ASD needs continued research to demonstrate its impact (Ploog et al. 2013).

Interventions targeting emotion recognition skills have the potential to address and ultimately improve deficits in social communication and interaction in individuals with autism. Bearing in mind the population’s affinity for technology, the purpose of this literature review is to investigate the research on interventions that employ technology and target emotion recognition skills.

Method

Empirically based literature on programs targeting emotion recognition was selected through electronic and ancestral searches of studies published between 2000 and 2016. The following databases were used: Academic Research Complete, Education Research Complete (EBSCO), and ERIC. The keywords used to generate the electronic search included autism, autistic, ASD, emotion recognition, facial expression recognition, affect, and intervention.

The titles of the studies were reviewed, and those that included the abovementioned keywords and did not focus on neurology or genetics were included for closer examination. After reading the abstracts of the resulting literature and, if necessary, reviewing the entire article, the results were narrowed further so that only articles available in full text and published in English from peer-reviewed journals were incorporated. Literature on interventions that did not utilize technology was excluded, as well as pilot studies and dissertations. Figure 1 presents the detailed process used to identify these studies.

Fig. 1 Literature review process

Together, the electronic and ancestral searches yielded 10 articles that investigated technology-based interventions targeting emotion recognition. The periodicals in the search included Autism, Child and Family Behavior Therapy, The Journal of Autism and Developmental Disorders, The Journal of Child Psychology and Psychiatry, Psychology in the Schools, and Journal of Nonverbal Behavior. The articles included in the review of the literature used single-subject, quasi-experimental, group experimental, randomized controlled trial, and randomized block designs.

Results

The following studies investigated the effectiveness of technology-based interventions designed to increase emotion recognition skills in individuals with autism. The studies are organized based on the intervention program explored in order to facilitate the succeeding discussion on the efficacy of specific interventions. Table 1 presents an overview of these 10 studies.

Table 1 Technology-based interventions targeting emotion recognition

Discrete Trial Training

Only one study was found that used a discrete trial approach. McHugh et al. (2011) examined a discrete trial training approach that incorporated video stories to teach children with autism to recognize emotions. The authors used a multiple baseline design across behaviors and participants to instruct three 5-year-old boys with autism to label four situation-based emotions (happy, sad, angry, and afraid). Prior to the intervention, the students received 40 h of applied behavior analysis (ABA) training, which did not include emotion recognition training. The setting of the study varied and included the bedroom, living room, or playroom. The researchers conducted generalization probes in natural environments (e.g., the garden or living room). Training sessions occurred 6 days each week and 10 times each day at different periods throughout the day. Sessions were approximately 2–5 min in length. The interventionists included two members of the child’s instructional team who were trained in ABA; individuals who were not part of the child’s instructional team were included during assessments of generalization.

In order to collect baseline data, participants viewed 12 video stories, which starred two puppets. Similar to a discrete trial, interventionists presented one 11-s long video story. Then, the interventionists asked, “How will [character’s name] feel when [situation]?” The interventionists did not signal whether the participant responded correctly or incorrectly. The procedure was repeated for all 12 video stories.

The emotion recognition training included 12 video stories for each of the four emotions, and the emotions were presented in multiple examples to increase generalization. Similar to the baseline procedure, the interventionists would ask the question but would immediately use an echoic prompt (e.g., “Say-happy”) to ensure correct responding. The interventionists systematically faded this prompt based on the participant’s progress. The interventionists recorded whether the child responded or did not respond within 3 s and reinforced correct responses. For incorrect or non-responses, the interventionists would say, “No,” and following a brief pause (i.e., inter-trial interval), another trial that included prompting was implemented to ensure accurate responding. The first emotion, happy, was introduced in isolation. Following two to three repetitions of the happy emotion (i.e., massed trial prompts), the interventionist stopped providing the echoic prompt to see if the child responded independently. The interventionist interspersed previously mastered tasks in subsequent trials if the child responded correctly (i.e., task interspersal). The interventionists introduced sad in the same manner as happy when the child correctly responded with 80 % accuracy in two repeated sessions with previously mastered tasks. The interventionists proceeded to introduce afraid in the same manner until the child successfully identified the emotion. All three emotions were subsequently presented. When the participant labeled each of the three emotions with 80 % accuracy, the interventionists introduced the emotion angry until the child successfully identified it. All four emotions were then randomly presented until the participants identified the emotions with 80 % accuracy.

The authors used novel people, settings, and stimulus items to assess generalization. They also asked questions that did not resemble the previously structured trial questions and utilized other cartoon clips the participants had not seen previously. Four maintenance probes were conducted 15 days following the generalization probes.

McHugh et al. (2011) indicated that students with ASD identified emotions, and generalized and maintained those skills after the treatment. However, McHugh et al. pointed out that they failed to collect baseline data on students’ responses with novel people, settings, and stimuli. Therefore, the participants’ abilities to generalize emotion recognition skills are uncertain. The authors also concluded that the study needed to be replicated to include more participants and, most importantly, to address ecological validity by examining whether participants would generalize their ability to identify emotions in vivo rather than with video scenarios.

FaceSay

FaceSay, a computer-based social skills program, uses interactive games that include lifelike animal and human avatars, instead of static images. FaceSay includes three games entitled Amazing Gaze, which emphasizes joint attention and eye gaze, Band Aid Clinic, which focuses on facial processing and recognition, and Follow the Leader, which addresses emotion recognition. In Amazing Gaze, an avatar is surrounded by objects, numbers, or faces, and the participants must select the correct item at which the avatar is gazing. Band Aid Clinic distorts a section of the avatar’s face, and participants must select the appropriate section of a face to replace it. In Follow the Leader, participants are first asked to match facial expressions and then asked to manipulate faces to match the facial expressions of the avatar. The program’s avatar “coach” provides instructions for the games and praise for correct responses, and prompts when the participant responds incorrectly.

Using a randomized controlled study and mixed factorial design, Hopkins et al. (2011) examined the effectiveness of FaceSay, the maintenance of the gained skills after the intervention, and the participants’ ability to generalize the skills in natural environments. The authors recruited 49 participants diagnosed with ASD. The participants included 44 boys and 5 girls who were between 6 and 15 years of age. The researchers assessed emotion recognition with Ekman and Friesen’s Unmasking the Face. Hopkins et al. (2011) measured facial recognition skills with the Benton Facial Recognition Test (Benton 1980) and social skills with the Social Skills Rating System (SSRS; Gresham and Elliott 1990). Based on scores from the Kaufman Brief Intelligence Test (KBIT; Kaufman and Kaufman 1990), participants were placed into two groups: low functioning autism (LFA) and high functioning autism (HFA). The authors then randomly assigned the participants to treatment and control groups.

The participants in the control group, which consisted of 14 students with HFA and 11 students with LFA, used Tux Paint twice a week for 6 weeks. Tux Paint is a drawing software program and was selected because it did not address the skills being targeted by FaceSay. Each session was approximately 10–25 min. Participants in the experimental group, which included 13 participants with HFA and 11 participants with LFA, used the FaceSay program for the same amount of time as the control group used Tux Paint.

The authors found that the participants with LFA who participated in the FaceSay program improved their emotion recognition and social interactions, and were better able to recognize and label emotions from photographs. Participants with HFA improved in facial recognition, emotion recognition, and social interactions. Participants with HFA generalized their emotion recognition skills to both photographs and drawings and were the only participants able to maintain their emotion recognition skills. Similar to McHugh et al. (2011), Hopkins et al. (2011) stated, “in line with the main difficulty encountered in other intervention programs for this population, is whether children’s improvements were transferred into the child’s more global social competence with peers and family in real settings” (p. 1552). In addressing this, the data suggest that both groups of participants did demonstrate improved social interactions in natural environments following the intervention.

FaceSay is a promising program. Implications for future research, as discussed by the authors, include comparing the program with other interventions and systematically measuring social interaction during naturally occurring opportunities (e.g., duration, frequency, including novel peers). Generalization from computer-based interventions to natural environments and opportunities needs further examination, as does the identification of prerequisite skills necessary for the program to be effective (e.g., IQ, autism symptomology, behavior, visual-spatial abilities).

Expanding upon the study conducted by Hopkins et al. (2011), Rice et al. (2015) specifically examined how FaceSay may improve emotion recognition, mentalizing (i.e., ToM), social impairment, and peer interactions. According to Baron-Cohen et al. (1985), mentalizing “refers to the ability to attribute mental states, such as beliefs, thoughts, feelings, plans, and intentions, to oneself and others and to recognize that others’ mental states may be different from one’s own” (as cited in Rice et al. 2015). The authors recruited 31 participants with ASD who were between the ages of 5 and 11 and identified as having HFA. The authors then placed 16 participants in the experimental group and 15 participants in the control group. The control group used SuccessMaker, a computer program targeting reading skills.

The Neuropsychological Assessment Affect Recognition subtest and the Theory of Mind subtest (NEPSY-II; Korkman et al. 2007) were used to assess emotion recognition and mentalizing. The Social Responsiveness Scale, Second edition (SRS-2; Constantino and Gruber 2002), and a behavior coding and rating system modified from one developed by Hauck et al. (1995) were used to examine social impairment and both positive and negative social interactions. Participants used their assigned software program in their school environment for 25 min once a week for 10 weeks. Paraeducators monitored participants’ attention to and interaction with the program and did not provide any additional communication besides referring participants back to the program to keep them on task.

The results of the study suggested improvements in emotion recognition, mentalizing, and observed social skills following FaceSay. However, there was not a significant difference in positive and negative peer interactions. The results provide further support for the effectiveness of FaceSay and cautiously suggest that emotion recognition is linked to the development of social interaction, although the improvements did not generalize to interactions with peers. The authors attribute the limited generalization to their observational instruments and to the complexity of the social and communication skills needed to engage with peers in a more positive manner, which build on emotion recognition, mentalizing, and face processing. In discussing the limitations and implications for future research, Rice et al. (2015) suggested assessing participants beyond the six basic emotions to identify whether participants with HFA are able to recognize more subtle and complex emotions and mental states presented in static photographs. Additionally, the authors suggested that dynamic videos be used to assess emotion recognition and that generalizability be addressed in more natural contexts and situations. Lastly, the authors noted that the study should be conducted with a broader population of students with ASD, including various ages and severity levels.

Mind Reading: The Interactive Guide to Emotions

“Mind Reading: The Interactive Guide to Emotions” was developed by Simon Baron-Cohen and colleagues and published by Jessica Kingsley. Mind Reading is a computer program aimed at increasing young children’s abilities to recognize emotions and at increasing their social behavior. The program consists of three modules: the Emotions Library, Learning Center, and Games Zone. The Emotions Library comprises over 400 emotions that are reinforced with videos, recordings, images, mini-stories, and other relevant information. The Learning Center includes lessons and quizzes on the emotions, and the Games Zone consists of activities where students must match an emotion with a partially revealed face or match emotions in a “Memory” style game (Junek 2007).

LaCava et al. (2010) used a multiple baseline across participants design to assess the effectiveness of the program. The authors selected four male participants who were diagnosed with ASD, were between 7 and 11 years of age, and had no intellectual disability. Three subtests from the Cambridge Mindreading Face-Voice Battery for Children (CAM-C; Golan and Baron-Cohen 2008) were used to assess the participants’ abilities to recognize the six basic emotions and nine complex emotions. Monochromatic photographs and cartoon faces, and colored pictures taken from the Ekman and Friesen Pictures of Facial Affect (Ekman and Friesen 1976), the Mind Reading program, and the Teaching Children to Mind Read curriculum were also used. The authors identified the number of positive social interactions through observations and measured social validity by having teachers and parents complete a questionnaire.

The participants used the Mind Reading program, with the support of an adult, for 7 to 10 weeks for an average of 12.3 h during the intervention period. Participants were to use the program for 1–2 h each week; however, they were only permitted to use the Games Zone component for 33 % of that allotted time. The adults supporting the participants monitored them and ensured that they were using all three components of the program. The adults also engaged the students in discussions to reinforce the skills and attempt to apply them to realistic scenarios.

From pre- to post-assessments, all four participants improved in their ability to recognize basic and complex emotions according to the results of the CAM-C subtests and emotion recognition tasks. Observations showed limited improvement in social interactions; however, parents and/or teachers anecdotally noted an increase in social interactions when completing the social validity questionnaire. The study by LaCava et al. (2010) provided support for the use of the Mind Reading program. However, given the limitations of the study, it would be important to further examine the contradiction between the investigators’ observations and the reports of parents and teachers about improvements in social interaction. The authors also mentioned that it was not clear whether Mind Reading or the adult support was more effective in increasing emotion recognition skills, or whether a peer tutor may be more beneficial than an adult. As mentioned with previous studies, it is imperative to measure whether participants, following the intervention, were able to further generalize their skills during naturally occurring opportunities.

In another study, Weigner and Depue (2011) used a quasi-experimental research design to also examine the Mind Reading program supplemented with guided lessons. The participants in the treatment group included six participants with ASD, who were between 7 and 11 years of age, were not receiving medication, and were not identified as having an intellectual disability. The participants in the control group included 11 typically developing children who were between 7 and 12 years of age. These participants did not receive training but were tested before and after the intervention period.

Participants in the treatment group used the Mind Reading program for five sessions over 3 weeks. Each session was 30–45 min. In Session 1, parents completed the Autism-Spectrum Quotient (AQ; Baron-Cohen et al. 2001), and the children completed a pretest using the Mind Reading program. The pretest assessed the participants’ abilities to identify 10 emotions, including the six basic emotions. In Sessions 2 through 4, the participants were instructed through lessons presented on the Mind Reading software. After each lesson, the participants completed 20 practice questions in the Mind Reading program, which asked the children to select video clips or pictures correctly representing the named emotion on the screen. The last session, Session 5, included the posttest. The posttest mirrored the pretest measures and evaluated participants’ abilities to recognize 10 emotions.

Based on the results of the study, Weigner and Depue (2011) stated that the Mind Reading program was effective in increasing the emotion recognition skills of the participants in the treatment group. After the intervention, the authors found that the treatment group’s posttest scores were comparable to the control group’s scores in emotion recognition. However, it is difficult to determine whether the Mind Reading program or the investigators’ guided lessons affected the participants’ emotion recognition skills. The authors also failed to assess treatment fidelity. In addition, the authors used only one assessment to measure emotion recognition; multiple measures would have better captured the proposed effect of the intervention. Weigner and Depue (2011) also acknowledged that generalization and maintenance of skills were not assessed in their study and that the small sample size limited the study’s external validity.

Lopata et al. (2012) incorporated Mind Reading into a comprehensive school-based intervention (CSBI) for 12 participants with HFA between the ages of 6 and 9 years. Prior to beginning the CSBI, participants completed a 3-week summer preparation program (SPP), which involved four daily, 70-min intervention sessions (including classroom instruction and practice sessions) that targeted social skills, facial and vocal emotion recognition, interpretation of nonliteral language, and interest expansion. School staff also received training on working with students with HFA and on the CSBI.

The CSBI was a 10-month program based on the cognitive-behavioral model and consisted of direct instruction and opportunities for repeated practice in natural contexts. Participants received cognitive instruction, engaged in role-playing scenarios, and were reinforced for engaging in the targeted behaviors. The CSBI also included social skills groups that were conducted three times per week for a total of 60–90 min. The social skills groups utilized the instructional protocol of Skillstreaming (Goldstein et al. 1997) and involved teaching, modeling, role-playing, performance feedback, and transfer of learning. Additionally, therapeutic activities were conducted two times per week, for a total of 40 to 60 min, and involved cooperative activities between a participant and a typically developing peer. A behavior reinforcement system was used to prompt and reinforce participants’ use of targeted behaviors. Lastly, Mind Reading was used three times a week for a total of 60 min/week to teach face and emotion recognition. Parent training was provided once a month for a total of ten 60–90 min sessions and focused on working with individuals with HFA, strategies to promote social and behavioral skills, CSBI procedures and content, and strategies to increase generalization.

The Skillstreaming Knowledge Assessment (SKA; Lopata et al. 2010) was used to assess the effectiveness of the social skills groups by having participants describe appropriate social behavior following the telling of a brief vignette. The CAM-C (Golan and Baron-Cohen 2008) and The Diagnostic Analysis of Nonverbal Accuracy 2 Child Faces and Adult Faces subtests (DANVA2; Nowicki 1997) were used to assess participants’ emotion recognition skills. Parents and teachers were asked to complete a number of measures to evaluate participants’ social skills. These most notably included the Adapted Skillstreaming Checklist (ASC; Lopata et al. 2008), the Social Skills subscale of the Behavior Assessment System for Children, Second Edition (BASC-2; Reynolds and Kamphaus 2004), and the SRS (Constantino and Gruber 2002).

Following the CSBI, participants’ performance of targeted social skills and emotion recognition increased. Both teachers and parents reported that social performance improved and ASD symptomology was minimized. However, there are a number of limitations in the study, including the absence of a control group, a small sample size, and the need for “blinded” observers to more systematically rate participants’ social communication and interaction.

The Transporters

The Transporters was developed by Simon Baron-Cohen in collaboration with the University of Cambridge and the Autism Research Centre (ARC). The Transporters program is based on the theory that individuals with autism prefer systems and have circumscribed interests (Baron-Cohen et al. 2009). In order to appeal to children with ASD, the developers focused on facial expressions and the identification of emotions by grafting human faces onto animated and narrated vehicles. The Transporters includes eight different types of vehicles to draw the attention of children. The vehicles, which play roles in the episodes, include trams, cable cars, a chain ferry, a coach, a funicular railway, and a tractor (Baron-Cohen et al. 2009). The number of items in the setting of the episodes was minimized to limit distractions and allow children to focus on the vehicles and faces. The DVD was designed for children between 3 and 8 years of age and contains 15 episodes (Baron-Cohen et al. 2009). Each episode is 5 min in length and focuses on one of 15 basic and complex emotions. In order to increase generalization, the human faces superimposed onto the vehicles are of differing ages, sexes, and ethnicities (Baron-Cohen et al. 2009). The program is reinforced with quizzes and is supported with a Parent User Guide. Each episode is followed by an easy-level and a difficult-level quiz that require the child to match faces to other faces, match faces to emotions, and match situations to faces (Baron-Cohen et al. 2009). The level of difficulty is manipulated by the number of response options from which students must choose. Parents play a crucial role in supporting the children by allowing them to repeatedly watch episodes to reinforce skills and by enhancing learning with questions and supported discussions.

Golan et al. (2010) examined the effectiveness of The Transporters DVD in teaching 20 participants with ASD to recognize emotions. The authors included two control groups matched on age, sex, and verbal ability. The matched control groups comprised 18 children with ASD and 18 typically developing children. All participants were between 4 and 7 years of age. The participants in the intervention group viewed a minimum of three episodes each day for 4 weeks in their home. The researchers did not set a limit on the number of episodes the students could view each day. Golan et al. also encouraged parents to use the guide to further support their child’s training.

As pretest measures, the Childhood Autism Spectrum Test (CAST; Scott et al. 2002) and multiple tests that evaluated participants’ ability to generalize skills in emotion recognition were implemented. The authors first evaluated students’ abilities to generalize gained skills by identifying participants’ emotional vocabulary. In the test, participants were asked to define 16 emotions and provide examples of situations where a person may display each emotion. In an additional three tasks, the authors evaluated students’ abilities to match facial expressions to socio-emotional situations. The tasks included 16 pictures and scenarios that were presented out loud by the investigator. The children were then asked to point to one of three video clips of a character displaying an emotion corresponding to a picture and its scenario. Post-intervention measures were similar to the pretest measures, except that they included different pictures and scenarios, and the parents did not complete the CAST.

After the intervention, Golan et al. (2010) found that children in the treatment group, specifically those with higher cognitive ability, improved and performed similarly to the typically developing children in the control group. Participants in the treatment group showed growth in comprehending and recognizing the 15 emotions targeted in The Transporters program. The participants also identified the emotions in human faces that were not superimposed onto vehicles, which suggested further generalization of the skills. Children with ASD who viewed The Transporters DVD improved on the emotional vocabulary tasks and were better able to match facial expressions to socio-emotional situations. Golan et al. also attributed the efficacy of the program to its use of trains, which appeal to children and led them to view human faces more than they did with other programs.

In the study, Golan et al. (2010) demonstrated the effect of The Transporters DVD. However, the authors did not control the amount of parental support or the maximum number of episodes each participant was permitted to view. This complicates interpretation of the data, because the gains in emotion recognition skills may be attributable to the number of episodes viewed and/or the adult support in addition to The Transporters DVD itself. The authors also implemented the intervention for a short period of time and used a small number of measures, which limits what can be concluded about the full impact of the intervention. As with other studies, it would be important to examine whether participants viewing The Transporters program generalized their skills to other contexts with naturally occurring scenarios.

In another study on The Transporters program, Young and Posselt (2012) aimed to determine whether gains in emotion recognition skills correlated with cognitive ability, the amount of time spent viewing The Transporters program, and the amount of parental support. The authors recruited 25 children between 4 and 8 years of age who were diagnosed with autism. The authors assessed the participants with the Wechsler Preschool and Primary Scale of Intelligence, Third Edition (WPPSI-III; Wechsler 2002), or the Wechsler Intelligence Scale for Children, Fourth Edition (WISC-IV; Wechsler 2003). The authors used the Block Design, Receptive Vocabulary, and Comprehension subtests of the Wechsler scales to determine non-verbal ability. The authors determined participants’ abilities to recognize emotions with the Affect Recognition subtest of the NEPSY-II and the Face Task. In the Face Task, participants selected, from two options, the appropriate word to describe each of 20 photographs of faces displaying basic and complex emotions. The authors used these assessments to collect pretest data.

The authors randomly assigned the participants to either a control group or an intervention group. Participants in the control group watched the Thomas the Tank Engine episodes that only focused on emotion. Participants in the intervention group viewed The Transporters program. Young and Posselt (2012) had all participants watch a minimum of three episodes for 3 weeks in their home setting and had the parents record the number of episodes their child viewed each day. Parents were also encouraged to use the guide to further support their child’s learning.

Following the intervention, the participants were assessed with the Social Communication Questionnaire (SCQ; Rutter et al. 2003), the NEPSY-II subtests, and the Face Task. When compared to the control group, participants in the treatment group showed significant improvements in identifying and labeling basic and complex emotions. The researchers determined participants’ improvement in social skills by evaluating social peer interest and eye contact. The participants in both the treatment and control groups showed improvements in social skills behavior. Young and Posselt (2012) also found no correlation between participants’ cognitive ability and progress in identifying emotions. The authors believed that this finding supported the use of The Transporters program not only with individuals with high IQ but also with individuals with lower IQ scores. However, the correlations between progress in recognizing emotions and the amount of time spent viewing The Transporters program, and between progress in recognizing emotions and the amount of parental support, were inconclusive. In addition, since the authors only measured correlations, neither the amount of time spent viewing The Transporters program nor the amount of parental support could be identified as causing the progress in recognizing emotions. Notably, the authors suggested that IQ is not a predictor of gains in emotion recognition and that there may not be a connection between improvements in emotion recognition and improvements in social interaction. The authors emphasized the need for continued research, a focus on generalization in natural contexts, and the maintenance of skills.

In another published study on The Transporters DVD, Williams et al. (2012) examined whether students generalized skills in emotion recognition to ToM tasks and situations requiring social skills and whether participants maintained the skills after 3 months. The participants included 55 children diagnosed with autistic disorder by the DSM-IV. The children were between 4 and 7 years of age and had completed the Autism Diagnostic Observation Schedule (ADOS; Lord et al. 1999) and WPPSI-III (Wechsler 2002).

The authors collected pretest data using the Vineland-II Socialization questionnaire (Sparrow et al. 2005) and ADOS (Lord et al. 1999). Williams et al. (2012) determined participants’ emotion recognition skills with matching tasks using Ekman and Friesen’s Pictures of Facial Affect (Ekman and Friesen 1976), WPPSI-III (Wechsler 2002), and the NEPSY-II (Korkman et al. 2007). The NEPSY-II tasks measured affect recognition and ToM. The researchers randomly assigned the participants to a treatment group or a control group. The investigators administering the assessments were blind to the assignment of participants. Children in the control group viewed the Thomas the Tank Engine television series, and the children in the treatment group viewed The Transporters program for at least 15 min each day for 4 weeks. The parents recorded the number of hours their child viewed the program and also provided support with the use of the guide. The authors collected post-assessment data after 1 week and then again after 3 months. Williams et al. also conducted maintenance probes for 46 of the 55 participants.

When compared to the control group, Williams et al. (2012) found that the children receiving The Transporters intervention showed improvements in identifying and matching the emotion anger when using the Pictures of Facial Affect (Ekman and Friesen 1976). Participants in the treatment group did not improve on the NEPSY-II tasks or the Vineland-II Socialization questionnaire (Sparrow et al. 2005). With the maintenance probes, Williams et al. found that children receiving The Transporters intervention were better able to identify happiness but were unable to maintain their ability to identify anger. Following the intervention, the data showed that children made minimal gains in generalizing the skills to ToM tasks and social situations.

Contradicting past research studies affirming The Transporters program, Williams et al. (2012) stated that there was little evidence that supported the effectiveness of The Transporters program in training children with ASD to better recognize emotions. Williams et al. suggested that failure to replicate the positive results of prior studies may be due to the inclusion of students with more severe cognitive disabilities. The authors’ inclusion of participants with LFA, however, was also a limitation of their study. The authors could not collect assessment data from all participants with LFA, and the assessments measuring their ability to recognize more complex emotions were not completed. Other limitations included the failure to control variables, such as parental support and amount of time spent using the program.

MiX

Russo-Ponsaran et al. (2016) modified MiX by Humintell©, a program that targets emotion recognition, by adding coach assistance, didactic instruction for seven basic emotions, and scaffolded instruction that included repeated practice with increased presentation speeds, guided attention to relevant facial cues, and imitation of expressions. The investigators recruited 25 participants with ASD between the ages of 8 and 15 years. Twelve participants were block randomized to the intervention group, and 13 were placed in the waitlist control group.

Training occurred for 45–60 min twice a week for an average of six sessions. “Each training session followed the same format and consisted of didactic instruction, imitation exercises, repeated practice, and in-training competency testing” (Russo-Ponsaran et al. 2016, p. 23). A video describing key facial features associated with one emotion was played two times. With the second viewing, the coach would use a screen cover to pinpoint the key facial features as they were being described. The participants were then asked to imitate that facial feature on the computer screen through the web camera. This was repeated until all key facial features were imitated. Then, the participant was asked to practice the entire facial expression (i.e., combining all facial features). A practice test of 42 items then followed.

Outcome measures evaluated emotion recognition, self-expression, and generalization. Specifically, emotion recognition instruments included the MiX competency post-test, two subtests of the Comprehensive Affective Testing System (CATS; Weiner et al. 2006), the Diagnostic Analysis of Nonverbal Accuracy Child Faces test (DANVA; Nowicki and Duke 1994), and the NEPSY-II: Affect Recognition Subtest (Korkman et al. 2007). Videos of participants were used to directly assess self-expression. Generalization of emotional awareness (i.e., vocabulary, comprehension, self-assessment) was evaluated through Emotion Fluency, a test developed by the authors and based on the Clinical Evaluation of Language Fundamentals, Fourth Edition (CELF-4; Semel et al. 2003), and Emotion Storybook, a test developed by the authors and based on the storytelling task from the ADOS (Lord et al. 1999). The Child and Adolescent Social Perception Scale (CASP; Magill-Evans et al. 1995) and Bar-On Emotional Quotient Inventory: youth version (BarOn EQI:YV; Bar-On and Parker 2000) were also used to assess social-emotional awareness and participants’ perceptions of their own social functioning and understanding. Generalization was further assessed with parent and teacher questionnaires.

Participants in the intervention group demonstrated an increased ability to recognize emotions based on the direct assessments. However, generalization of emotional awareness (i.e., vocabulary, comprehension, self-assessment) was limited, and participants were only minimally able to transfer such skills to more complex social scenarios, as demonstrated with the CASP (Magill-Evans et al. 1995). Implications for future research, as discussed by Russo-Ponsaran et al. (2016), include increasing the sample size and ensuring maintenance of gained emotion recognition skills. Furthermore, future research should examine participant performance in more naturalistic settings as a way to demonstrate generalization of skills.

Discussion

The purpose of this literature review was to investigate the research on programs that used technology to target emotion recognition skills for individuals with an ASD. The studies suggested that the respective emotion recognition training programs they investigated were promising in increasing such skills and potentially impacting other social skills deficits in the population. However, as suggested by the authors of the selected studies, continued research remains critical in addressing methodological limitations, generalization of skills in realistic contexts and situations that extend beyond technology, and further expanding our understanding of emotion recognition interventions.

The diagnostic features or symptomology of participants are important in identifying possible prerequisite skills that may be necessary for the intervention to be effective. A majority of studies included only participants who were specifically identified with HFA or who were identified as having no intellectual disability (LaCava et al. 2010; Lopata et al. 2012; Rice et al. 2015; Weigner and Depue 2011). Rice et al. (2015) identified participants with a full scale IQ > 70 as HFA, and Russo-Ponsaran et al. (2016) specified that individuals were eligible if they had a full scale IQ ≥ 80 and were verbal. Hopkins et al. (2011) included both participants with HFA (KBIT score greater than 70) and LFA (KBIT score less than 70). Lopata et al. (2012) included participants with an IQ > 70 and a Verbal Comprehension Index or Perceptual Reasoning Index score ≥ 80 on the WISC-IV (Wechsler 2003), as well as a receptive or expressive language score ≥ 75 on a short form of the Comprehensive Assessment of Spoken Language (CASL; Carrow-Woolfolk 1999). The average score for participants in the study conducted by Golan et al. (2010) was 24.0 on the CAST (Scott et al. 2002), with an average autism severity score of 18.38 on the SCQ (Rutter et al. 2003). The study conducted by Williams et al. (2012) included participants with an average full scale IQ of 77.93 and an average score of 6.79 on the ADOS (Lord et al. 1999). Taken together, these findings suggest that participants with HFA and no or minimal cognitive disability are more successful in gaining emotion recognition skills following the intervention (Golan et al. 2010; Hopkins et al. 2011; LaCava et al. 2010; Lopata et al. 2012; Rice et al. 2015; Weigner and Depue 2011; Williams et al. 2012). However, Young and Posselt (2012) found no correlation between cognitive ability and emotion recognition when examining the effectiveness of The Transporters. Given that the majority of the studies included participants with HFA, it would be important to examine whether such interventions are also effective for participants with LFA.

The authors of several of the studies supplemented the emotion recognition program with additional and varying degrees of adult support (Golan et al. 2010; LaCava et al. 2010; Williams et al. 2012; Young and Posselt 2012) or lesson instruction (Weigner and Depue 2011). McHugh et al. (2011) included adult prompts, and Lopata et al. (2012) applied a CSBI, which consisted of multiple components such as Mind Reading, direct instruction, practice opportunities, therapeutic activities, and social skills groups. Consequently, it remains unclear which interventions or intervention components were most successful in increasing emotion recognition skills, and more research is required.

Another factor that may have affected the effectiveness of the training programs was the amount of time that participants were exposed to the intervention. In the study conducted by McHugh et al. (2011), participants were exposed to the intervention for ten 2–5-min sessions, 6 days a week until reaching mastery. Participants engaged in the FaceSay program for 10–25 min/week for 6 weeks in the study conducted by Hopkins et al. (2011) and for 10 weeks in the study conducted by Rice et al. (2015); both studies yielded positive effects. LaCava et al. (2010) had participants use Mind Reading for 1–2 h/week for 7–10 weeks, and Weigner and Depue (2011) required participants to use the program for five 30–45-min sessions for 3 weeks. Participants from both studies were successful in increasing their ability to recognize emotions. The participants in Lopata et al.’s (2012) study took part in a 3-week summer preparation program, which was followed by four daily 70-min sessions for 10 months. Golan et al. (2010), Young and Posselt (2012), and Williams et al. (2012) had participants view a minimum of three episodes of The Transporters for 3–4 weeks. The results from these studies varied, which Williams et al. suggested may be due to the autism severity levels of the participants. Participants used MiX twice a week, for 40–60 min per session, across an average of six sessions. Future research on the duration and frequency of intervention sessions, and on whether they produce differential results, remains necessary.

Methodological issues discussed by researchers included increasing sample sizes, ensuring control groups, using blinded observers, utilizing more systematic measures to identify participants’ social interactions, and conducting future replications. Other recurring limitations and implications for future research, discussed by the investigators of all of the included studies, were the maintenance and generalization of gained emotion recognition skills. Although not substantially addressed in the studies reviewed in this paper, opportunities to generalize gained skills across novel settings, situations, and people are imperative in emotion recognition training. Students with ASD must be able to apply emotion recognition skills to situations requiring social interaction and communication, which are fundamental deficits in this population. Kandalaft et al. (2013) stated, “This lack of real-world training may hinder the generalization of treatment effects. Less is known about the potential change in social skills and social cognition when conducted in an individual real-time simulation of authentic social interactions” (p. 35). Consequently, providing increased opportunities to practice emotion recognition in applicable scenarios is recommended when teaching individuals with autism. In discussing ecological validity, McHugh et al. (2011), LaCava et al. (2010), Weigner and Depue (2011), Golan et al. (2010), Young and Posselt (2012), and Russo-Ponsaran et al. (2016) emphasized the need to implement interventions in natural contexts and situations. Furthermore, Hopkins et al. (2011) and Rice et al. (2015) emphasized the importance of participants utilizing and expanding upon emotion recognition skills to engage in more complex social interactions with peers.

The studies addressed in this literature review offer a foundation for future research on interventions targeting emotion recognition skills. Emotion recognition is critical in the development of a child, especially a child diagnosed with ASD. Identifying and improving upon emotion recognition interventions will be crucial in ameliorating the socio-emotional and communication deficits that are characteristic of this heterogeneous and continuously growing population.