Introduction

Autism spectrum disorder (ASD) is characterized by deficits in social communication and restricted, repetitive behaviors (American Psychiatric Association 2013). Many individuals with ASD experience social isolation that negatively influences all aspects of development as well as long-term outcomes (Muller et al. 2009). In particular, limitations in social skills can restrict an individual’s ability to make and maintain relationships, obtain and retain employment, live independently, and fully participate in society at large (McConnell 2002; White et al. 2007). One important outcome that is documented to be particularly at risk in individuals with ASD is making and maintaining authentic, reciprocal friendships (Petrina et al. 2014; Rowley et al. 2012; Solish et al. 2010). Friendship is one of the most fundamental aspects of quality of life, and its benefits are well documented (Parker and Gottman 1989). Many children with disabilities, including children with ASD, experience substantial difficulty in making and maintaining friends. Moreover, children with disabilities like ASD are often perceived as less socially competent and of lower social status than their typically developing peers (Orsmond et al. 2004).

Historically, efforts to promote positive social outcomes for individuals with ASD have involved either social skills training or peer buddy programs that encourage potential partners to interact with the individual with ASD. Extensive research has shown children with ASD can indeed learn these targeted social skills (e.g., Odom et al. 1994; McConnell 2002; Rao et al. 2008). Despite this, outcome data suggest, quite convincingly, that successful completion of social skills training does not translate to increases in the number, quality, or duration of friendships for children with ASD (Petrina et al. 2014; Finke 2016). On the basis of these continued poor outcomes, Finke (2016) proposed a theoretical reconceptualization of the fundamental elements that contribute to the formation and maintenance of authentic friendships in adolescents with and without ASD. Finke (2016) argued successful friendship-based outcomes are dependent on the presence of at least three elements: (a) offering a means by which both partners can enjoy equal status/contribution to the relationship (Newcomb and Bagwell 1995); (b) structuring intervention to include mutually motivating opportunities for interaction; and (c) offering frequent opportunities for interaction in an activity context preferred by both partners.

Finke (2016) also offered several potential contexts that may meet these criteria as examples and potential avenues for researchers to begin to explore her ideas empirically. One such context is videogame play. Young adults are particularly likely to play video games, as well as to identify as “gamers.” Lenhart et al. (2015) reported two-thirds (67%) of those ages 18–29 play video games, while 22% report the term “gamer” describes them well. The Entertainment Software Association (ESA 2015) reported the gender breakdown of American videogame players to be 56% male and 44% female. The ESA also reported that 42% of Americans play videogames more than 3 h per week. In addition to being popular, it has been reported that nearly 40% of people who play videogames three or more hours per week (i.e., the most frequent videogame players) play social games and spend up to 6 h per week playing videogames with others (either online or in person). Fifty-four percent of the most frequent videogame players reported feeling videogames help them connect with friends (ESA 2015).

Mazurek and Wenstrup (2013) reported that the typically developing children between the ages of 8 and 12 in their research sample played videogames approximately 11 h per week, while Finke et al. (2015) reported that children with ASD played videogames approximately 12 h per week. These data appear to suggest children with ASD engage with videogames at rates comparable to those of children without ASD. Further, per parent report, children with ASD commonly play videogames that are among the most popular with children without ASD (Finke et al. 2015). Children and adolescents with ASD have also indicated they spend the majority of their time with their friends playing videogames (Bauminger and Kasari 2000; Bauminger and Shulman 2003; Kuo et al. 2013). Similarly, adolescents without disabilities frequently use videogame play as a context for interacting with their friends (Olson 2010).

Prior to any investigation of the proposition that videogame play might offer a route toward establishing friendships, an understanding of how individuals with ASD attend to and engage with these games is needed. If visual attention patterns of individuals with ASD to characters or other aspects within the videogame are substantively different from those of peers without disabilities, it will be critical to identify and accommodate these differences before initiating an intervention.

Measuring Visual Attention to Videogames as a Necessary First Step

While videogames might have promise as a context for friendship intervention, the behavioral profile associated with ASD introduces several specific and unique issues that warrant examination prior to embarking on any intervention effort using videogames as the context. One of the core features contributing to diagnosis of ASD is atypical patterns of eye gaze during actual social interactions, such as gaze aversion or limited eye contact with social partners (APA 2013). Videogames are platforms within which videogame characters (the avatars within the videogame) enact the action and, oftentimes, interact with one another. If the visual attention atypicalities of individuals with ASD extend to attention paid to characters within videogames, then the play patterns of individuals with ASD might be quite different than those of their peers. In that event, any friendship intervention based on videogame play might be destined to fail, as it might not accommodate potential barriers of differential visual attention of players with and without ASD.

Atypicality in visual processing in individuals with ASD has formed the basis for some frequently cited models of ASD (e.g., Happe and Frith 2006). Describing the nature of visual attention deficits in individuals with ASD has been the focus of much research (see Ames and Fletcher-Watson 2010). Automated eye tracking technology is gaining popularity as a means of extending what is known about visual processing in social contexts, as the data collected can reveal the features that capture visual attention, a momentary marker of underlying cognitive and social processing (Gillespie-Smith and Fletcher-Watson 2014). As a result, there has been a steep increase in the number of eye tracking studies examining the visual social attention patterns and visual responses of individuals with ASD. Visual social attention “refers to the overt attentional bias to orient to and look at other people, notably their face and eyes, as well as to where they direct their attention” (Guillon et al. 2014, p. 280).

One of the proposed core deficits experienced by individuals with ASD is decreased attention to socially relevant stimuli, particularly faces, when compared to individuals without ASD (e.g., Dawson et al. 2005; Sasson 2006). Results of eye tracking studies examining this core deficit in children, adolescents, and adults with ASD have been mixed. Some studies have shown a clear difference in the visual attention patterns between individuals with and without ASD (e.g., Chawarska et al. 2013; Klin et al. 2009; Riby and Hancock 2009), while other studies suggest between-group performance is indistinguishable (e.g., Chawarska et al. 2012; Sasson et al. 2011; Fletcher-Watson et al. 2009). The differences in these outcomes may be related to differences in the stimuli used (static images vs dynamic videos) as well as the social demands presented within the task. To date, studies that used static images containing a single person within the image as stimuli garnered visual social attention patterns from individuals with and without ASD that were similar (e.g., Fletcher-Watson et al. 2009). Other studies have examined visual social attention using dynamic stimuli containing more than one person. In these contexts, individuals with typical development demonstrated increased quantity of time fixated on faces and eyes in these situations compared to individuals with ASD (e.g., Riby and Hancock 2009; Nakano et al. 2010). These results appear to suggest that as a stimulus becomes more socially complex (e.g., includes more than one person), the visual social attention patterns of individuals with ASD become more divergent compared to participants without ASD.

There is also evidence that, for individuals with ASD, relevance or salience of competing objects or events may affect their attention to faces (Sasson and Touchstone 2013). This may vary, however, as contexts and stimuli become more naturalistic (Guillon et al. 2014). Overall, current eye tracking research of visual social attention in individuals with ASD appears to suggest individuals with ASD have most difficulty orienting to faces when there are competing non-social objects in the visual field, and that individuals with ASD take longer to fixate on faces when the stimuli and display are complex (Guillon et al. 2014). One major limitation of the eye tracking research involving participants with ASD to date, however, is that, with few exceptions (e.g., Gillespie-Smith and Fletcher-Watson 2014), eye tracking technologies have rarely been used with individuals with ASD who have concomitant intellectual disabilities. This gap in the literature needs to be addressed as eyetracking technology has the power to reveal information about visual attention that is difficult or impossible to measure behaviorally, particularly in individuals with disabilities who cannot respond to conventional methods of assessment (Wilkinson and Mitchell 2014).

Purpose

The current study used automated eyetracking technology to investigate visual attention to a videogame stimulus. We examined visual attention to videogame stimuli because videogames appear to offer a uniquely well-suited environment for the fostering of friendships, but it is not yet known if children with ASD attend to and play videogames like children without ASD. We asked one primary and one secondary research question. The primary research question concerned attention allocation by individuals with and without ASD to a real-time video stream of a person playing a videogame (a young woman), displayed as an inset in a corner of the videogame display. Would individuals with ASD, who reportedly show reduced social referencing and gaze avoidance to complex social stimuli, show similar limitations in attention to the videogame player, and her reactions, in the context of watching videogame play?

The secondary question concerned how participants with and without ASD allocated their attention to the important/meaningful elements of the videogame. Specifically, we characterized whether individuals with ASD attend to the same elements of the videogame itself (the avatar, meaningful elements such as the “life” indicator, that is, the graphic indicating the health of the character in the game) as peers without ASD. This information might inform future interventions, as selective differences in attention by individuals with ASD might influence their videogame play strategies and, therefore, their possible success in participating in cooperative videogame play with a peer.

Method

Participants

Participants were 11 individuals with ASD and 8 with typical development. Participants with ASD were recruited from a local non-public school for children with ASD. Participants without ASD were recruited through personal contacts. Only children whose parents provided signed informed consent, who also provided their own assent to participate, and who met the inclusion criteria participated in the study. Inclusion criteria for the participants with ASD were: (1) having a documented diagnosis of an autism spectrum disorder from a medical professional, as recorded on school records, (2) being between the ages of 6 and 21, inclusive, (3) having parental (or primary caregiver) permission to participate in the investigation, and (4) providing their own assent to participate. Inclusion criteria for the participants with typical development were: (1) having no reported and/or documented history of any type of disability, (2) having parental (or primary caregiver) permission to participate in the investigation, and (3) giving their own assent to participate (if under the age of 18) or written consent to participate (if over the age of 18).

Participants with ASD ranged in age from 8;11 to 17;10 years. All but one had moderate to severe limitations in receptive vocabulary, as indicated by standard scores in the range of 20–63 on the Peabody Picture Vocabulary Test-IV (PPVT-IV; Dunn and Dunn 2007); one individual with ASD had a receptive vocabulary score within normal limits. We opted to retain this individual in the sample because visual inspection of the data indicated his eye gaze patterns were no different from the others with ASD; that is, his measures were not outliers, but rather were consistently within the range of measures. All of the participants with ASD had significant expressive limitations documented via school-based testing in their school and academic records, and all used various forms of augmentative and alternative communication (AAC) as their primary form of expressive communication. See Table 1 for demographic information for the participants with ASD.

Table 1 ASD demographics

Participants without ASD ranged in age from 11;11 to 20;1 years. All had receptive vocabulary scores at or above normal ranges, as indicated by standard scores in the range of 100–135 on the Peabody Picture Vocabulary Test-IV (PPVT-IV; Dunn and Dunn 2007). See Table 2 for additional demographic information for the typically developing participants. Matching across the groups of participants was based loosely on chronological age. Chronological age matching was attempted because the study aimed to inform future interventions that might use videogame play as a context for interventions targeting friendships among same-aged peers. Therefore, it was important to determine if the visual attention of adolescents with autism to the videogames was similar to or different from the attention of the peers with whom they might be playing, and making friends. There was at least one participant without ASD for every age (in years) of the participants with ASD.

Table 2 TD participant demographics

Materials and Stimuli Development

One videogame play clip involving the LEGO Marvel Superheroes videogame was captured using a Microsoft Xbox One. This picture-in-picture video clip displayed the videogame player’s face and the videogame play and was captured using the Twitch application. Twitch allows videogame players to broadcast their videogame play sessions using a picture-in-picture format. The viewer of the Twitch stream is able to see both the videogame play and the face of the videogame player. The Twitch stream of the videogame player for the current project was recorded using the screen recording feature in Apple QuickTime (see Fig. 1). The QuickTime recording was then saved and uploaded into the eyetracking software for data collection.

Fig. 1 Twitch feed capture

General Procedure

Each participant engaged in one eye tracking data collection session. During this session the participant was calibrated with the eye tracking equipment using a two-point gaze fixation procedure in which the participant was directed to look to the top left and then the bottom right corner of the screen by the presence of a familiar cartoon character. Once proper calibration was achieved, each participant watched the prerecorded videogame play clip that contained the picture-in-picture video inset of the player’s reactions to the videogame. There were no additional sessions and repeated viewing of the video was not allowed.

The Tobii T60 and the software Tobii Eye Tracking Studio tracked participants’ eye movements. The Tobii T60 captures the movements via infrared light that is projected from the top strip of the monitor. The infrared light bounces off the participant’s eyeball and these reflections are recorded by detectors along the bottom strip of the monitor. Using the participant’s distance from the monitor, the curvature of the cornea, and the location of the pupil, the system derives coordinates for the gaze location at each sample taken. The Tobii captures 60 samples of eye position per second (approximately 1 every 16.7 ms). The Tobii was connected to a Dell laptop where all of the data were stored within the Tobii software on the computer.

Data Collection Setting

All data were collected in a small room designated for the research activity. The room contained a table, the Tobii machine, the Dell laptop, and chairs. Participants were scheduled for 15-min sessions, which was sufficient for watching the clip and for transition to the data collection room and back to the classroom. Each participant sat in a chair 65 cm from the Tobii T60 screen located on top of the table. The researchers sat to the side of the participant in front of the Dell computer in order to start the video, as well as observe and monitor the participant during the viewing task. For the participants with ASD, the data collection session occurred at their school during school hours. For the participants without ASD, data were collected in a laboratory setting on a university campus.

Data Preparation

The Tobii software program was used to create “areas of interest,” that is, to enclose the areas on the screen in which actions or events meaningful to the videogame play occurred. Figure 2 illustrates one frame from the videogame, with the areas of interest (AOIs) illustrated. The primary research question in this study concerned whether or not participants with ASD referenced the real-time video of the videogame player during the videogame play. Therefore, one main AOI was the square enclosing the picture-in-picture video stream in the lower right corner of the screen (labeled “face” in Fig. 2). For the secondary research question, other AOIs related to other meaningful events were evaluated. These included Action Scene (the actual action being engaged in by the Lego character being controlled by the player), Big A (a large “A” shape that the character had to climb during the videogame), Dialog (the written set of instructions that scrolled along the bottom of the screen), and Life (the indicator of how much life the character had remaining). All other areas in the videogame that were not enclosed in the defined AOIs were treated as “Other”.
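The assignment of a gaze point to an AOI can be illustrated with a minimal sketch. It assumes static rectangular AOIs on a hypothetical 1280 × 1024 pixel display; the coordinates below are invented for illustration and are not the AOI boundaries drawn in Tobii Studio for this study. The names `AOI`, `AOIS`, and `classify` are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AOI:
    """A rectangular area of interest in screen pixel coordinates."""
    name: str
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x: float, y: float) -> bool:
        return self.left <= x <= self.right and self.top <= y <= self.bottom


# Hypothetical, non-overlapping rectangles (NOT the study's actual AOIs).
AOIS = [
    AOI("Face", 1000, 760, 1280, 1024),        # picture-in-picture inset
    AOI("Life", 0, 0, 200, 80),                # health indicator
    AOI("Dialog", 0, 940, 1000, 1024),         # scrolling instructions
    AOI("Big A", 500, 200, 800, 700),          # large "A" shape
    AOI("Action Scene", 100, 300, 450, 900),   # controlled character
]


def classify(x: float, y: float, aois=AOIS) -> str:
    """Return the name of the first AOI containing (x, y), else "Other"."""
    for aoi in aois:
        if aoi.contains(x, y):
            return aoi.name
    return "Other"
```

Any gaze point that falls outside every rectangle is labeled "Other", which mirrors the treatment of the remaining screen area described above.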

Fig. 2 Areas of interest

Dependent Measures

The dependent measure was the proportion of each participant’s own total fixation time allocated to each of the defined AOIs, and to “other”. For each participant we first derived the total time spent fixated anywhere on the videogame, and then calculated the proportion of that amount of time spent fixating on each of the AOIs. Fixations were defined as gaze that dwelled within a 35-pixel area for at least 100 ms (six or more samples), using the default filter settings on the Tobii Studio software. Data were analyzed for those participants who were calibrated and for whom more than 33% of the samples were obtained over the course of the session (one participant with ASD was excluded). Of the 13 children with ASD who participated in the eye tracking data collection sessions, usable data were collected and analyzed from 11 of them.
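The fixation criterion can be sketched with a generic dispersion-threshold procedure. This is not the Tobii Studio filter, whose internals are not described here; it is an illustration under two assumptions: that "within a 35-pixel area" means the horizontal and vertical spread of the samples each stay at or below 35 pixels, and that samples arrive at 60 per second so that six samples span 100 ms. The function names are hypothetical.

```python
SAMPLES_PER_SECOND = 60  # assumed sampling rate of the eye tracker


def detect_fixations(samples, max_spread_px=35.0, min_samples=6):
    """Group consecutive gaze samples into fixations.

    samples: list of (x, y) tuples, or None where the tracker lost the eye.
    Returns a list of (start, end) index pairs, with end exclusive.
    """
    fixations = []
    start, n = 0, len(samples)
    while start < n:
        if samples[start] is None:
            start += 1
            continue
        min_x = max_x = samples[start][0]
        min_y = max_y = samples[start][1]
        end = start + 1
        # Grow the window while the bounding box stays within the threshold.
        while end < n and samples[end] is not None:
            x, y = samples[end]
            lo_x, hi_x = min(min_x, x), max(max_x, x)
            lo_y, hi_y = min(min_y, y), max(max_y, y)
            if hi_x - lo_x > max_spread_px or hi_y - lo_y > max_spread_px:
                break
            min_x, max_x, min_y, max_y = lo_x, hi_x, lo_y, hi_y
            end += 1
        if end - start >= min_samples:
            fixations.append((start, end))
            start = end
        else:
            start += 1  # window too short; slide forward by one sample
    return fixations


def fixation_ms(start, end):
    """Duration of a fixation in milliseconds at the assumed sampling rate."""
    return (end - start) * 1000 / SAMPLES_PER_SECOND
```

Lost samples (for example, blinks or a hand occluding the eyes) end the current window, so they never count toward fixation time.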

The proportion of time allocated to each AOI was calculated based on each participant’s own time spent fixated anywhere on the screen, rather than the total possible viewing time. The clip was 160.38 s in length; however, no participant showed fixations for that entire period. Gaps in fixation can occur for any number of reasons, including saccades, blinks, looks away from the screen, or presence of stereotypic behaviors that momentarily occlude the infrared recording (e.g., bringing a hand to the face). Previous eyetracking research with individuals with ASD and concomitant intellectual disability indicated that the overall amount of time spent fixated, anywhere on the screen, can be significantly less for individuals with ASD than for those with typical development (Wilkinson and Light 2014). In the current study, the individuals with ASD spent a mean of 120.12 s fixated on the screen, while the matched peers spent a mean of 153.78 s fixated on the screen. An independent-samples Mann–Whitney U test confirmed this difference was statistically significant (U = 70, p = 0.033). An individual with a lower overall fixation time will also tend to spend less time fixated on any given element, which could compromise comparison across individuals if absolute time values were used. Proportion of attention allocation was therefore calculated to reflect the allocation of attention to different elements relative to each individual’s own total fixation time.
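The normalization can be illustrated with a short sketch. The function name and the two viewers' values are hypothetical and are not data from this study; they show only that two viewers with different total fixation times can have identical proportional allocations.

```python
def aoi_proportions(fixation_seconds):
    """Convert per-AOI fixation times into proportions of the viewer's
    own total fixation time (not of the clip length)."""
    total = sum(fixation_seconds.values())
    if total == 0:
        return {name: 0.0 for name in fixation_seconds}
    return {name: secs / total for name, secs in fixation_seconds.items()}


# Hypothetical viewers: B fixated longer overall, but in the same pattern.
viewer_a = {"Action Scene": 60.0, "Face": 12.0, "Other": 48.0}   # 120 s total
viewer_b = {"Action Scene": 75.0, "Face": 15.0, "Other": 60.0}   # 150 s total

props_a = aoi_proportions(viewer_a)
props_b = aoi_proportions(viewer_b)
```

Comparing absolute seconds would suggest viewer B attended more to every element, whereas the proportions show the same relative allocation for both viewers.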

Results

The primary research question concerned visual attention to the player in the picture-in-picture live-stream. The secondary research question concerned visual attention to other meaningful elements in the videogame, including the action, the Big A, the life points, and the dialog box, as well as the remaining “other” areas. Table 3 presents the mean percent of time spent by each group on each area. Because of the small sample sizes, nonparametric statistical analyses (independent-samples Mann–Whitney U tests) were used to determine if the attention allocation was similar or different across groups. The groups did not differ on percentage of time allocated to either the AOIs or the remainder of the videogame, that is, the area coded as “other” (Mann–Whitney U values ranging from 60 to 54, with p values from 0.206 to 0.545). Visual attention to the dialog box fell just shy of statistical significance with the nonparametric analysis, U = 68, p = 0.051 (these p values are without adjustment for multiple comparisons).
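For readers unfamiliar with the statistic, the following sketch computes the Mann–Whitney U values for two independent groups using midranks for ties. It does not compute p values and is not the software used for the analyses reported here; the function name is hypothetical. In practice a statistics package would typically be used (for example, SciPy provides `scipy.stats.mannwhitneyu`).

```python
def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) for two independent samples, using midranks for ties."""
    # Pool the observations, remembering which group each came from.
    pooled = sorted(
        [(value, 0) for value in group_a] + [(value, 1) for value in group_b]
    )
    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        # Find the run of tied values starting at position i.
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of 1-based ranks i+1 .. j
        count_a = sum(1 for k in range(i, j) if pooled[k][1] == 0)
        rank_sum_a += midrank * count_a
        i = j
    n_a, n_b = len(group_a), len(group_b)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b
```

The two U values always sum to the product of the group sizes, so with groups of 11 and 8 they sum to 88, and a value near 44 indicates heavily overlapping distributions.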

Table 3 Mean percent of time fixated on each AOI for each group

To illustrate the characteristics of the data subjected to analysis, Fig. 3 presents a box plot of the percentages for all AOIs except “life”; this AOI is not depicted because it received such limited fixation from both groups that the box plots contained zeroes. The box plot illustrates the range of percentages (vertical lines), the middle 50% of percentages (the boxes), and the median percentage (horizontal lines within boxes). Although there is greater variability in the group with ASD, the figure illustrates the findings reported at the summary level, that is, little other difference in gaze allocation between groups.
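The quantities shown in such a box plot can be computed as in the sketch below. The function name is hypothetical, and the quartile convention (the "inclusive" method of Python's `statistics.quantiles`) is an assumption; plotting software may use a slightly different convention.

```python
from statistics import quantiles


def box_plot_summary(values):
    """Return the range, quartiles, and median drawn in a simple box plot.

    Requires at least two values. Whiskers here span the full range
    (minimum to maximum), and the box spans the first to third quartile.
    """
    ordered = sorted(values)
    q1, median, q3 = quantiles(ordered, n=4, method="inclusive")
    return {
        "min": ordered[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": ordered[-1],
    }
```

Applied to each group's percentages for one AOI, the distance between `q1` and `q3` gives the height of the box, which is where greater variability in one group would become visible.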

Fig. 3 Box plot of range, median, and middle 50% for fixations on AOI by group

Discussion

The results of the current study indicated that during a passive viewing task, the participants with and without ASD visually attended to the videogame stimulus (160.38 s in length) similarly, with the possible exception of the written dialog box that scrolled along the bottom. The data also indicated the participants with ASD referenced the face of the videogame player with equal duration of fixation as their peers without ASD. These results have implications for the future design of interventions using videogames as a context for promoting friendships between children with and without ASD, for the type of videogame chosen for use in these interventions, as well as for the individuals with ASD who may participate in these interventions.

Implications of Results

Friendship Interventions

The overall similarity in the results between the groups is very promising for the use of videogames as a context for friendship-based interventions and interactions. Overall, the participants with and without ASD attended to the same features of the videogame with similar levels of intensity, suggesting the two groups are following the actions of the characters and the progression of the videogame story similarly. Clearly, one next step is to evaluate whether the two groups also play the videogames similarly.

If videogames could be used as a context for interventions targeting friendship-based outcomes, a world of new possibilities may be opened up for individuals with ASD as well as their typically developing peers. The results of the current investigation indicate that patterns of visual attention to a videogame stimulus, in this passive viewing activity, are similar across groups, and that children with ASD attended to core features within the videogame, for the most part, in the same ways as their peers without ASD.

Choice of Videogame

The finding of weak but possible differences in visual attention to the dialog box could have important implications for the choice of videogame used in interventions with children with ASD. Text is frequently used within videogames to direct players’ actions and to provide information regarding how to move through the game and make progress. The differences between groups for this variable could be related to differences in literacy skill between the two groups. In an actual videogame play situation or intervention, this discrepancy between the groups could imply a need for careful consideration of the participants’ familiarity with the videogame, or the need for the literate partner to read the text to the nonliterate partner.

Individuals with ASD

Unusual visual attention patterns, most particularly to human figures and faces, are a core feature of the ASD diagnosis. It is therefore interesting that in the current study, children with ASD visually attended to most features of the videogame similarly to their nondisabled peers. Other important implications for individuals with ASD are related to the complexity of the videogame stimuli used in the current investigation, the chronological age matching procedures used for the comparison group, and the contrasting findings to previous eye tracking research involving participants with ASD.

First, the videogame stimuli used for data collection in the current study were socially complex. The similarity of the findings in the picture-in-picture condition indicates that, in this situation, individuals with ASD socially oriented and alternated their gaze between a human face and the videogame play scene similarly to their non-ASD peers. This is an interesting finding as the face in the picture-in-picture condition was a human, and not one of the human-like LEGO figures. This may suggest something unique about the way individuals with ASD process and participate with people during videogame play compared to how they process and attend to human faces and figures in other contexts (Gillespie-Smith and Fletcher-Watson 2014).

Second, matching based on chronological age rather than other factors (e.g., cognitive status, language status) is important because friendships develop between peers, generally people who are chronologically similar. This increases the social validity of the results and suggests that similar patterns of visual attention might be apparent in natural contexts. Third, the findings from the current investigation largely contrast with previous findings from eye tracking research involving individuals with ASD. Previous research has suggested individuals with ASD demonstrate decreased attention to socially relevant stimuli, particularly faces, when compared to individuals without ASD (e.g., Dawson et al. 2005; Sasson 2006). To date, the majority of the research demonstrating similar visual attention patterns between individuals with and without ASD has used static images containing a single person within the image as stimuli (e.g., Fletcher-Watson et al. 2009). The results of the current investigation add to this research in a meaningful way because the stimulus was dynamic and complex. This contrasts with previous research results that suggested that as a stimulus becomes more socially complex (e.g., includes more than one person), the visual social attention patterns of individuals with ASD become more divergent compared to participants without ASD (Guillon et al. 2014).

Limitations

The current study has several recognized limitations. This study was limited to 19 individuals, 11 with ASD and 8 typically developing. This study only recruited participants with ASD from one school in central Pennsylvania, all of whom had very similar demographic profiles, which may limit generalization. Only individuals between the ages of 8 and 17 participated, which did not permit the study of young children or older adults. The participants only watched one videogame (LEGO Marvel Superheroes) clip, not allowing examination of visual attention patterns for other types of videogames, or other types of clips. All participants sat in a chair facing the Tobii T60 screen located on top of the table while researchers sat to the side of the participant to reduce distractions. Lastly, only participants for whom the Tobii system had captured 33% or more of their fixations were used for the data analysis, removing some individuals from the study.

Future Research

The results of the current study suggest individuals with ASD visually attend to a videogame stimulus similarly to typically developing individuals during passive viewing of a videogame play clip. Future research should investigate whether these patterns of visual attention similarity hold when passively viewing other types of videogame clips (e.g., traditional videogame play and scenes where characters interact with each other to move the narrative of the videogame forward). This will determine if the findings from the current project are robust and can be replicated with other types of videogames and videogame stimuli.

Future research should also determine whether individuals with and without ASD play videogames similarly, alone and with a play partner. Attending to videogames and playing videogames are two different tasks and need to be studied separately. A longitudinal study that observes how children with ASD interact and converse while playing videogames together would provide insight into the nature of friendship formation and maintenance, and would provide evidence of the utility of using videogame play as a context for making and maintaining authentic friendships.