Introduction

Social Skills Training Programs (SSTPs) are a treatment approach most commonly used with older, higher-functioning children and adolescents with Autism Spectrum Disorder (ASD). In SSTPs, a therapist meets regularly with a small group of children with ASD to teach and discuss social skills, such as those involved in having conversations or demonstrating empathy (e.g., Laugeson et al. 2009; Solomon et al. 2004). While several promising SSTPs have been developed, “social skills” is a complex construct, and it can be difficult to adequately assess improvements in social skills (Koenig et al. 2009; White et al. 2007). More comprehensive and accurate measurement of the social skills construct is critical for evaluating the efficacy of SSTPs (McMahon et al. in press).

Both social knowledge (i.e., whether a child knows a social skill cognitively) and social performance (i.e., whether a child applies that social skill to everyday life) are important skill sets to consider when evaluating the efficacy of SSTPs (Gresham 1997; Lerner et al. in press; McMahon et al. in press). An assessment of social performance provides an indirect assessment of social knowledge, as knowing a social skill cognitively is often a prerequisite for applying that social skill appropriately. However, an assessment of social knowledge does not provide an assessment of social performance; inattention, poor impulse control, and other participant characteristics may make it difficult for individuals to apply known social skills (Antshel et al. 2011). Thus, it is not sufficient to assess social knowledge when evaluating the efficacy of SSTPs; social performance should also be assessed.

Questionnaires are the most commonly used assessment method in the current SSTP literature. Nearly all studies in the SSTP literature use parent-report questionnaires, approximately half use child-report questionnaires, and a handful use teacher-, clinician-, or intervention staff-report questionnaires. Questionnaires can be used to index social knowledge or social performance, but they are most frequently used to index social performance. For example, the majority of questions on the Social Responsiveness Scale (Constantino 2004), a parent-report questionnaire often used to evaluate the efficacy of SSTPs, inquire about social performance in everyday social settings (e.g., “Has trouble keeping up with the flow of a normal conversation” and “Avoids eye contact or has unusual eye contact”). Despite their emphasis on social performance, questionnaires yield data that must be interpreted with caution; questionnaire responses are subjective, and respondents are rarely blind to intervention status (Rao et al. 2008). Providers of the intervention, including clinicians and staff, and recipients of the intervention, including parents and children, may be positively biased in their report of the intervention. Although it is difficult to keep providers and recipients blind to intervention status and time consuming to recruit and retain teachers or other respondents not involved in the intervention, questionnaires completed by blind respondents are a useful assessment method for evaluating SSTPs (McMahon et al. in press).

Approximately half of the studies in the SSTP literature use social cognitive assessments to measure changes in social skills. In general, social cognitive assessments measure a child’s social cognitive skills in a lab-based setting; they do not measure how well a child can apply those skills to real-life social situations. For example, in a theory-of-mind assessment such as the Sally–Anne test (Baron-Cohen et al. 1985), a child is asked to determine what a character in a story is thinking. This assessment demonstrates a child’s capacity to think about another person’s thoughts (social knowledge), but it does not measure whether a child actually thinks about and/or responds to another person’s thoughts in everyday life (social performance). As such, social cognitive assessments should be combined with other assessment approaches that more clearly index social performance (McMahon et al. in press).

Finally, a few studies in the SSTP literature use observation of social behavior as an assessment method. This is the only assessment method that can directly examine social performance in a natural environment. As such, it has been considered the “most ecologically valid method of assessing children’s social skills” (Elliott and Gresham 1987, p. 97) and has emerged as the primary assessment method in the Applied Behavior Analysis (ABA) intervention literature (Vismara and Rogers 2010). Observation of a child’s behavior can be more time-consuming and resource-intensive than other methods and can be prone to measurement error if behavioral coders are unreliable or biased in their coding (McMahon et al. in press; Merrell 2001). However, the ecological validity of this assessment method and the success with which it has been used in the ABA literature strongly suggest further applications of this assessment method to the SSTP literature.

Observation of Social Behavior

Although observation of social behavior is common in the ABA literature, few studies in the SSTP literature have used data derived from direct observations. In the SSTP literature, there is not yet a clear consensus as to which social behaviors are most important to track over the course of an intervention; however, at least two themes are beginning to emerge: First, in several studies, researchers have evaluated the degree to which participants initiate and/or respond to a peer’s social interaction (Bauminger 2002; LeGoff 2004; Owens et al. 2008; Ruble et al. 2008). After participation in a SSTP, participants showed increased social initiations (Bauminger 2002; LeGoff 2004; Ruble et al. 2008) and social responses (Bauminger 2002; Ruble et al. 2008), indicating that these social behaviors may be malleable to intervention. Second, several studies have also tracked the frequency with which participants engage in peer interactions (Bauminger 2002, 2007a; Hillier et al. 2007; LeGoff 2004; Lerner and Mikami 2012; Owens et al. 2008). Both LeGoff (2004) and Owens et al. (2008) showed that participants spent more time interacting with peers at school after participation in a clinic-based SSTP. Likewise, Hillier et al. (2007) showed that participants interacted more frequently with one another during the later weeks of a SSTP compared to the earlier weeks of a SSTP. These results suggest that peer interaction may also be malleable to intervention. Thus, social initiation/response and peer interaction are emerging in the literature as social behaviors that are of theoretical importance and sensitive to intervention effects.

The Current Study

In the current study, we observed children’s social behavior weekly during a SSTP activity. This study extends the previous literature in several ways: (1) It is one of the first studies to examine changes in social behavior during the group time of a SSTP (e.g., Hillier et al. 2007; Lerner and Mikami 2012; Ruble et al. 2008). (2) It is also one of the first studies to use multiple data points to determine the extent to which social behavior changes over the course of a SSTP (Barry et al. 2003; Lerner and Mikami 2012). The use of multiple data points is advantageous because it allows true change to be differentiated from measurement error and it provides information about the shape of each person’s growth trajectory over time (Singer and Willett 2003). The current study used 19 data points to model changes in social behavior, compared to previous studies which have used 4 (Lerner and Mikami 2012) and 8 data points (Barry et al. 2003). Singer and Willett (2003) note that more data points allow for more reliable and precise estimates of change. (3) Finally, this is one of the first studies to examine predictors (e.g., age, gender, verbal IQ, intervention attendance) of change in social behavior over the course of a SSTP (LeGoff 2004).

Consistent with the literature (Bauminger 2002; LeGoff 2004; Owens et al. 2008; Ruble et al. 2008), we coded children’s verbal speech as initiating, responding, or other (e.g., self-talk). We hypothesized that initiating and responding vocalizations would increase while other vocalizations would decrease over the course of the intervention.

Also consistent with the literature (Bauminger 2002, 2007a; Hillier et al. 2007; LeGoff 2004; Lerner and Mikami 2012; Owens et al. 2008), we coded the amount of time that children spent interacting with others in the SSTP. Interactions were coded as dyadic interactions, small group interactions, or time spent by self. Since intervention leaders tended to scaffold interactions, we further tracked whether dyadic interactions were with a peer or leader and whether small group interactions were with a group of peers only or a group of peer(s) and leader(s). We hypothesized that dyadic peer interactions and small group peer interactions would increase while dyadic leader interactions, small group peer and leader interactions, and time spent by self would decrease over the course of the intervention.

Methods

Participants

Participants enrolled in the Social Adjustment Enhancement Intervention, a fee-for-service clinical SSTP for children and adolescents with social-cognitive difficulties (adapted from Solomon et al. 2004), were recruited for the present study. To be enrolled in this intervention, families had to have contacted the University of California, Davis M.I.N.D. Institute to express interest in the intervention and met with a clinician to determine appropriateness for the intervention. Individuals with below average cognitive or language abilities, severe behavioral problems, and/or insufficient insurance or funds to pay for the intervention were referred for services elsewhere.

Participants enrolled in this clinic-based intervention (n = 28) were then contacted by research staff to determine their interest in and eligibility for the research project. To determine eligibility, participants were screened before and after coming into the research lab. In the initial phone screening, participants were required to have an ASD diagnosis from a community mental health professional. Fourteen participants met the initial screening criteria (13 individuals declined participation in the research project and 1 individual did not have an ASD diagnosis).

After coming into the research lab, participants were required to meet 2 of the following 3 diagnostic criteria: ≥60 on the Social Responsiveness Scale (SRS; Constantino 2004), ≥15 on the Social Communication Questionnaire (SCQ; Berument et al. 1999), and ≥15 on the Autism Spectrum Screening Questionnaire (ASSQ; Ehlers et al. 1999). Participants were also required to have a verbal IQ ≥ 65 on the Wechsler Abbreviated Scale of Intelligence (WASI; Wechsler 1999). All participants met inclusion criteria on this second screening, yielding a final sample size of 14 participants. See Table 1 for participant characteristics.
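The second-stage inclusion rule above can be written out as a short check. This is only an illustrative sketch: the cutoffs are taken from the text, while the class name, field names, and function name are hypothetical and were not part of the study's procedures.

```python
from dataclasses import dataclass

# Cutoffs as stated in the text; all identifiers below are illustrative.
SRS_CUTOFF = 60
SCQ_CUTOFF = 15
ASSQ_CUTOFF = 15
VERBAL_IQ_CUTOFF = 65


@dataclass
class ScreeningScores:
    srs: float        # Social Responsiveness Scale score
    scq: float        # Social Communication Questionnaire score
    assq: float       # Autism Spectrum Screening Questionnaire score
    verbal_iq: float  # WASI verbal IQ


def meets_lab_criteria(scores: ScreeningScores) -> bool:
    """True if at least 2 of the 3 questionnaire cutoffs and the verbal IQ cutoff are met."""
    questionnaires_met = sum([
        scores.srs >= SRS_CUTOFF,
        scores.scq >= SCQ_CUTOFF,
        scores.assq >= ASSQ_CUTOFF,
    ])
    return questionnaires_met >= 2 and scores.verbal_iq >= VERBAL_IQ_CUTOFF
```

For example, a hypothetical participant who scores 62 on the SRS, 16 on the SCQ, and 10 on the ASSQ with a verbal IQ of 90 would meet criteria, whereas the same participant with an SCQ score of 10 would not.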

Table 1 Participant characteristics

Procedure

Intervention

Participants enrolled in the intervention were divided into small groups by age. Two school-aged groups (ages 10–13) and one adolescent group (ages 13–16) were formed (a fourth group of younger school-aged children did not participate in the present study). One of the school-aged intervention groups included two peers with typical development to serve as social role models and provide additional opportunities for peer social interaction. Participants attended the intervention for 1.5 h per week over 22 weeks. On average, participants attended 89 % of the intervention sessions (see Table 1).

During the intervention, participants were involved in the following weekly activities: a structured introduction time in which participants answered questions about themselves (e.g., Are your friends similar to you or different from you?), a didactic lesson time with topics ranging from friendship to conversation, an unstructured playground time, an unstructured game playing time (i.e., “Game Time”), and a structured joke telling time. During Game Time, participants generally played board and card games with one another; popular Game Time activities included Uno, Mancala, TinkerToys, and Jenga, among other games. At the end of each intervention session, participants were given a short homework assignment related to the lesson topic. Parents attended a concurrent psychoeducational group, and some siblings attended a concurrent support and recreational group. The intervention curriculum used in this study was adapted from Solomon et al. (2004); see Solomon et al. (2004) for a detailed description of the intervention curriculum, including lesson topics and sample activities.

Assessment

Prior to data collection, the study was approved by the Institutional Review Board at our university. Participants and their parents came into the lab within 6 weeks before the intervention began or within the first week of the intervention to give informed consent and complete the screening measures. Within 6 weeks after the intervention ended, parents completed an informal assessment of other interventions that their child had received since their first visit to the research lab.

Game Time was not structured by intervention staff and thus provided a platform to observe natural interactions among children. Behavioral coding occurred weekly during Game Time, lasting for an average duration of 16 min per week. Behavior was coded for 19 weeks: Behavior was not coded during the first 2 weeks of the intervention, which allowed participants to acclimate to their small groups, and behavior was not coded during the last week of the intervention, as graduation activities were substituted for Game Time.

Behavioral coders were not involved in delivering the clinical intervention. Eight undergraduate research assistants in the lab and the first author served as behavioral coders. Due to time constraints, behavioral coders received limited training before the intervention; coders received further training during weekly meetings throughout the intervention. If inconsistencies in coding were identified, they were discussed in the weekly training sessions to determine the most appropriate behavioral code. Behavioral coders rotated which participants they coded, such that they were not consistently matched to the same participant. Twenty-two percent of the behavioral coding was double coded, and Cronbach’s alpha was used to determine reliability for the social behavior summary scores (see Data Analyses).
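As a reminder of what the reliability statistic measures, the following is a minimal sketch of the standard Cronbach's alpha formula applied to double-coded summary scores, treating each coder as an "item". The function name and input layout are assumptions for illustration; the text does not describe how the statistic was actually computed in the study.

```python
from statistics import variance


def cronbach_alpha(scores_by_coder):
    """Cronbach's alpha for k coders who scored the same n observations.

    scores_by_coder: list of k lists, one per coder, each of length n.
    Uses the standard formula alpha = k/(k-1) * (1 - sum(item variances) / variance(totals)).
    """
    k = len(scores_by_coder)
    if k < 2:
        raise ValueError("at least two coders are required")
    n = len(scores_by_coder[0])
    if any(len(scores) != n for scores in scores_by_coder):
        raise ValueError("all coders must score the same observations")

    # Sum of each coder's sample variance across observations.
    sum_item_variances = sum(variance(scores) for scores in scores_by_coder)
    # Variance of the per-observation totals across coders.
    totals = [sum(observation) for observation in zip(*scores_by_coder)]
    return (k / (k - 1)) * (1 - sum_item_variances / variance(totals))
```

With two coders who agree perfectly, the function returns 1.0; disagreement between coders pulls the value toward (or below) zero.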

Measures

Screening Measures

Autism Spectrum Screening Questionnaire (ASSQ; Ehlers et al. 1999): In this 28-item questionnaire, parents rate their child’s behaviors as being the same, somewhat different, or different from the behaviors of other children. Behaviors are those characteristic of Asperger Syndrome (e.g., idiosyncratic intellectual interests). This measure has been validated against clinical diagnosis and has shown good reliability.

Social Communication Questionnaire (SCQ; Berument et al. 1999): This parent-report questionnaire focuses on reciprocal social interaction, communication, and repetitive and stereotyped patterns and behaviors. It was developed from the 40 critical items of the Autism Diagnostic Interview (ADI; Lord et al. 1994), a gold-standard diagnostic tool; it correlates strongly with the ADI and shows high reliability.

Social Responsiveness Scale (SRS; Constantino 2004): In this 65-item questionnaire, parents report on their children’s social awareness, cognition, communication, motivation, and mannerisms. This questionnaire has been validated against the ADI and shows high reliability.

Wechsler Abbreviated Scale of Intelligence (WASI; Wechsler 1999): This assessment uses the Vocabulary and Similarities Subtests to estimate verbal IQ and the Block Design and Matrix Reasoning Subtests to estimate performance IQ. The WASI has excellent reliability for both children and adults, and it has been validated against other tests of intelligence, including the Wechsler Intelligence Scale for Children.

Measure of Participation in Additional Interventions

Additional Interventions Questionnaire: This informal questionnaire was designed by research staff to index participants’ involvement in additional interventions. After the SSTP had ended, participants’ parents were asked to indicate whether their child had participated in behavioral, psychological, speech-language, occupational, sensory integrative, physical, and/or other types of therapy since their first visit to the research lab. In addition, parents were asked to indicate whether their child had participated in an additional SSTP elsewhere.

Behavioral Coding System

The behavioral coding system used in the current study was an adaptation of Bauminger’s coding system (2002, 2007a).

Vocalizations: Participants’ vocalizations were coded as Initiating, Responding, or Other. Vocalizations directed toward another person in the absence of a conversation were coded as Initiating. Vocalizations directed toward another person in the presence of a conversation (i.e., within approximately 10 s of a previous vocalization) were coded as Responding. Vocalizations that were not clearly directed toward another person (e.g., self-talk) were coded as Other. The frequency and types of vocalizations within a 20-s interval were coded. Vocalizations were only coded once, such that a vocalization longer than 20 s was only coded in the first 20-s interval. Reliability was excellent for Responding vocalizations (α = 0.95), good for Other vocalizations (α = 0.83), and acceptable for Initiating vocalizations (α = 0.74).
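The decision rule for vocalizations can be summarized in a few lines. The sketch below is illustrative only: coding in the study was done by human observers, and the function names, the use of onset timestamps, and the exact 10-s boundary (the text says "approximately 10 s") are assumptions.

```python
from typing import Optional

CONVERSATION_WINDOW_S = 10.0  # "approximately 10 s" in the text; exact boundary is assumed
INTERVAL_S = 20.0             # length of one coding interval


def code_vocalization(directed: bool,
                      onset_s: float,
                      previous_vocalization_s: Optional[float]) -> str:
    """Classify one vocalization as Initiating, Responding, or Other."""
    if not directed:
        # Not clearly directed toward another person (e.g., self-talk).
        return "Other"
    in_conversation = (
        previous_vocalization_s is not None
        and 0 <= onset_s - previous_vocalization_s <= CONVERSATION_WINDOW_S
    )
    return "Responding" if in_conversation else "Initiating"


def interval_index(onset_s: float) -> int:
    """Index of the 20-s interval in which a vocalization starts.

    A vocalization is counted once, in the interval where it begins,
    even if it runs past the end of that interval.
    """
    return int(onset_s // INTERVAL_S)
```

For instance, a directed vocalization at 35 s that follows another at 30 s would be coded Responding, whereas the same vocalization following one at 20 s would be coded Initiating.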

Interactions: Talking with or engaging in an activity with another person was coded as an interaction. Participants’ interactions were coded as Peer, Leader, Group of Peers, Group of Peer(s) and Leader(s), or Self. Participants interacting one-on-one with a peer or intervention leader were respectively coded as Peer or Leader. Participants interacting with a small group of individuals were respectively coded as Group of Peers or Group of Peer(s) and Leader(s), depending on whether a leader was present in the group. Participants not interacting with others were coded as spending time by Self. The types of interactions occurring within a 20-s interval were coded, such that more than one type of interaction could occur in a 20-s interval. Reliability was excellent for interacting with a Leader (α = 0.94), interacting with a Group of Peers (α = 0.93), interacting with a Group of Peer(s) and Leader(s) (α = 0.96), and spending time by Self (α = 0.92), and reliability was good for interacting with a Peer (α = 0.89).

Data Analyses

Social Behavior Summary Scores

Social behavior summary scores were calculated, such that participants had a summary score for every coded variable at each intervention session. To calculate summary scores for vocalizations, the number of vocalizations across all of the coding blocks was summed and divided by the total number of coding blocks. This yielded an average number of Initiating, Responding, and Other vocalizations per 20-s coding block for a given participant in a given intervention session. To calculate summary scores for interactions, the number of coding blocks during which a participant engaged in a particular interaction was counted and divided by the total number of coding blocks. This yielded a proportion of time spent in Peer, Leader, Group of Peers, and Group of Peer(s) and Leader(s) interactions and a proportion of time spent by Self for a given participant in a given intervention session. Since participants could engage in more than one type of interaction in a 20-s interval, the summary scores for interactions are not dependent on one another and do not add to one. If a participant was absent from the intervention room for an entire coding block (e.g., taking a bathroom break), that coding block was not included in any of the calculations for the social behavior summary scores.
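The summary score calculation can be sketched as follows for one participant in one session. The data layout (a list of coding blocks, with None marking a block in which the participant was absent) and all identifiers are assumptions made for illustration. With an average Game Time of 16 min, a session would contain roughly 48 such 20-s blocks.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set

VOCALIZATION_CODES = ("Initiating", "Responding", "Other")
INTERACTION_CODES = ("Peer", "Leader", "Group of Peers",
                     "Group of Peer(s) and Leader(s)", "Self")


@dataclass
class CodingBlock:
    vocalizations: Dict[str, int]  # count of each vocalization type in this 20-s block
    interactions: Set[str]         # interaction types observed in this 20-s block


def summary_scores(blocks: List[Optional[CodingBlock]]) -> Dict[str, float]:
    """Summary scores for one participant in one session.

    Blocks recorded as None (participant absent for the whole block) are excluded.
    """
    present = [block for block in blocks if block is not None]
    if not present:
        raise ValueError("no coded blocks for this session")
    n_blocks = len(present)

    scores: Dict[str, float] = {}
    # Vocalizations: mean count per coding block.
    for code in VOCALIZATION_CODES:
        total = sum(block.vocalizations.get(code, 0) for block in present)
        scores[code] = total / n_blocks
    # Interactions: proportion of coding blocks in which the type occurred.
    for code in INTERACTION_CODES:
        occurred = sum(code in block.interactions for block in present)
        scores[code] = occurred / n_blocks
    return scores
```

Because a single block can contain more than one interaction type, the interaction proportions returned by this function can sum to more than one, as noted above.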

Hierarchical Linear Modeling (HLM)

Two-level hierarchical linear models (HLM) were used to examine the data, with weekly social behavior summary scores nested within persons. HLM was used to analyze (1) social behavior at the beginning of the intervention and (2) changes in social behavior throughout the intervention. HLM analyses using the restricted maximum likelihood approach were run separately for each dependent variable: Initiating, Responding, and Other (vocalizations); and Peer, Leader, Group of Peers, Group of Peer(s) and Leader(s), and Self (interactions). As recommended by Raudenbush and Bryk (2002), we used a “step up” strategy for model building, rather than a “saturated” strategy. For each of the dependent variables, we engaged in 9 steps of model building. At each step, we tested a component of the model and only retained that component if it was significant. Across all steps, intervention week was included as a predictor of the slope (β₁₀), as it was a variable of theoretical interest and provided a test for the hypothesis of change in social behavior over the intervention.

In the first step, we tested the variance component at the intercept (i.e., variability in social behavior at the beginning of the intervention; r₀ᵢ), and in the second step, we tested the variance component at the slope (i.e., variability in the rate of change of social behavior from the beginning to the end of the intervention; r₁ᵢ). In the third through fifth steps, we tested age in years (β₀₁), gender (β₀₂), and verbal IQ (β₀₃) as predictors of the intercept (i.e., predictors of social behavior at the beginning of the intervention), and in the sixth through ninth steps, we tested age in years (β₁₁), gender (β₁₂), verbal IQ (β₁₃), and number of intervention sessions attended (β₁₄) as predictors of the slope (i.e., predictors of the rate of change in social behavior from the beginning to the end of the intervention). The number of intervention sessions attended did not include the final intervention week, as behavior was not coded during the final week. Age in years, verbal IQ, and number of intervention sessions attended were centered around the grand mean, and gender was coded such that 0 = male and 1 = female. The final model contained intervention week and any other predictors that were significant during model building.
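For orientation, the full set of components tested across the nine steps can be written as a two-level growth model. The equations below are a sketch in conventional HLM notation rather than a model reported in the study: the symbols Y, π, e, and β₀₀ and the variable labels are assumptions, and the final model for each dependent variable retained only intervention week and the components that were significant. Week is assumed to be coded so that the intercept corresponds to the beginning of the intervention.

```latex
\begin{align*}
\text{Level 1:}\quad Y_{ti} &= \pi_{0i} + \pi_{1i}\,\mathrm{Week}_{ti} + e_{ti} \\
\text{Level 2:}\quad \pi_{0i} &= \beta_{00} + \beta_{01}\,\mathrm{Age}_{i} + \beta_{02}\,\mathrm{Gender}_{i} + \beta_{03}\,\mathrm{VIQ}_{i} + r_{0i} \\
\pi_{1i} &= \beta_{10} + \beta_{11}\,\mathrm{Age}_{i} + \beta_{12}\,\mathrm{Gender}_{i} + \beta_{13}\,\mathrm{VIQ}_{i} + \beta_{14}\,\mathrm{Sessions}_{i} + r_{1i}
\end{align*}
```

Here Y is the summary score for participant i at week t, π₀ᵢ and π₁ᵢ are that participant's intercept and slope, and Age, VIQ, and Sessions are grand-mean centered.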

Results

Participation in Additional Interventions

Some parents indicated that their child participated in behavioral (n = 2), psychological (n = 2), speech-language (n = 8), occupational (n = 1), sensory integrative (n = 1), physical (n = 1), and/or other (n = 1) types of therapy since their first visit to the research lab. In addition, three parents indicated that their child participated in an additional SSTP elsewhere.

Social Behavior Summary Scores

See Table 2 for mean social behavior summary scores at the beginning, middle, and end of the behavioral coding weeks and across all behavioral coding weeks. See Table 3 for a summary of all HLM models. Note that the results for the variance components are reported in Table 3, but these results are not presented in the text.

Table 2 Mean social behavior summary scores at the beginning, middle, and end of the behavioral coding weeks and across all behavioral coding weeks
Table 3 Hierarchical linear models

Initiating Vocalizations

Older participants made significantly fewer Initiating vocalizations at the beginning of the intervention than younger participants, t(12) = −2.34, p = 0.04. Participants significantly decreased in the number of Initiating vocalizations made from the beginning to the end of the intervention, controlling for the effects of age at the beginning of the intervention, t(215) = −5.51, p < 0.01 (see Fig. 1).

Fig. 1 Change in the frequency of individual participants’ Initiating (a), Responding (b), and Other (c) vocalizations and the proportion of time individual participants spent interacting with a Leader (d) and with a Group of Peers (e) over the course of the intervention. Note that individual participant slopes only vary when the variance component for the slope is significant and thus retained in the model

Responding Vocalizations

There were no significant predictors of the number of Responding vocalizations made at the beginning of the intervention. Participants significantly increased in the number of Responding vocalizations made from the beginning to the end of the intervention, t(215) = 2.09, p = 0.04 (see Fig. 1).

Other Vocalizations

There were no significant predictors of the number of Other vocalizations made at the beginning of the intervention. Participants significantly decreased in the number of Other vocalizations made from the beginning to the end of the intervention, t(251) = −3.26, p < 0.01 (see Fig. 1).

Interaction with a Peer

Older participants spent significantly more time interacting with a Peer at the beginning of the intervention than younger participants, t(12) = 3.01, p = 0.01. Participants who attended more intervention sessions showed a significantly steeper increase in the amount of time spent interacting with a Peer from the beginning to the end of the intervention compared to participants who attended fewer intervention sessions, controlling for the effects of age at the beginning of the intervention, t(214) = 2.61, p = 0.01.

Interaction with a Leader

There were no significant predictors of time spent interacting with a Leader at the beginning of the intervention. Participants spent marginally less time interacting with a Leader from the beginning to the end of the intervention, t(13) = −1.79, p = 0.10 (see Fig. 1).

Interaction with a Group of Peers

Males spent significantly less time interacting with a Group of Peers at the beginning of the intervention than females, t(12) = 3.14, p = 0.01. Participants spent significantly more time interacting with a Group of Peers from the beginning to the end of the intervention, controlling for the effects of gender at the beginning of the intervention and age and intervention attendance throughout the intervention, t(213) = 2.44, p = 0.02 (see Figs. 1, 2). Younger participants showed a significantly steeper increase in the amount of time spent interacting with a Group of Peers from the beginning to the end of the intervention than older participants, controlling for the effects of gender at the beginning of the intervention and intervention attendance throughout the intervention, t(213) = −2.61, p = 0.01 (see Fig. 2). Participants who attended fewer intervention sessions showed a significantly steeper increase in the amount of time spent interacting with a Group of Peers from the beginning to the end of the intervention compared to participants who attended more intervention sessions, controlling for the effects of gender at the beginning of the intervention and age throughout the intervention, t(213) = −2.23, p = 0.03 (see Fig. 2).

Fig. 2 The proportion of time spent interacting with a Group of Peers over the course of the intervention for participants at one standard deviation above and below the mean age and one standard deviation above and below the mean number of intervention sessions attended

Interaction with a Group of Peer(s) and Leader(s)

There were no significant predictors of time spent interacting with a Group of Peer(s) and Leader(s) at the beginning of the intervention. Also, there were no significant predictors of time spent interacting with a Group of Peer(s) and Leader(s) from the beginning to the end of the intervention.

Time Spent by Self

Males spent significantly more time by Self at the beginning of the intervention than females, t(12) = −2.86, p = 0.01. There were no significant predictors of time spent by Self from the beginning to the end of the intervention.

Summary

At the beginning of the intervention, males spent less time interacting with a Group of Peers and spent more time by Self compared to females. Also, older participants made fewer Initiating vocalizations and spent more time interacting with a Peer compared to younger participants. From the beginning to the end of the intervention, participants made fewer Initiating vocalizations, more Responding vocalizations, and fewer Other vocalizations, spent more time interacting with a Group of Peers, and spent marginally less time interacting with a Leader. Younger participants showed a steeper increase in the amount of time spent interacting with a Group of Peers from the beginning to the end of the intervention compared to older participants. Participants who attended more intervention sessions showed a steeper increase in the amount of time spent interacting with a Peer and a less steep increase in the amount of time spent interacting with a Group of Peers from the beginning to the end of the intervention compared to participants who attended fewer intervention sessions.

Discussion

Overall, this study shows that behavioral coding can be a useful assessment method for examining weekly changes in social behavior over the course of a SSTP. Using this assessment method, participation in a SSTP was associated with positive changes in social behavior, including increased vocalizations directed towards peers and increased interactions with peers, during the game-playing activity time of an intervention. Both age and gender predicted participants’ social behavior at the beginning of the intervention while age and intervention attendance predicted changes in participants’ social behavior over the course of the intervention.

Intervention Effects

Consistent with our hypothesis, Responding vocalizations increased and Other vocalizations decreased over the course of the intervention. These results indicate that participants more frequently responded to vocalizations from others and less frequently engaged in non-directed speech as the intervention progressed. The intervention curriculum used in this study included didactic lessons and activities for teaching conversational skills, and these results may suggest that participants learned and applied the conversational skills taught in the curriculum (Solomon et al. 2004). An alternative explanation for these findings is that interacting with unfamiliar peers and intervention leaders at the beginning of the intervention was more stressful and anxiety-provoking than interacting with familiar peers and intervention leaders at the end of the intervention (Lopata et al. 2008). Thus, both didactic lessons on conversational skills and greater familiarity with peers and intervention leaders may have facilitated more responsive and directed speech in participants, and future studies will be required to differentiate between these two effects.

Contrary to our hypothesis, Initiating vocalizations decreased over the course of the intervention. Decreased Initiating vocalizations may be the natural result of increased Responding vocalizations; longer and/or more frequent conversations among group members may have limited the need for participants to initiate new conversations. This pattern of increased Responding and decreased Initiating vocalizations may be indicative of more back-and-forth conversations and fewer choppy conversations with awkward pauses and new initiations. However, as pauses in conversation were not coded, it is not clear how long conversations were sustained or how frequently they occurred. In addition, coders had the most difficulty establishing reliability for Initiating vocalizations, such that this finding may partially reflect coders’ additional training and expertise in identifying Initiating vocalizations as the intervention progressed.

In a sample of children with ASD ages 8–17, Bauminger (2002) found that participants were more likely to initiate a social interaction than respond to a social interaction, regardless of their participation in a SSTP. Although social initiations (e.g., Initiating Joint Attention; IJA) are considered to be more advanced than social responses (e.g., Responding to Joint Attention; RJA) in the infant literature (Mundy and Newell 2007), social response may become increasingly important with development. For example, Gillespie-Lynch et al. (2012) recently showed that social responses (RJA) in early childhood were more predictive of social and communicative skills in adulthood than social initiations (IJA).

For older children and adolescents with ASD, social response may be a more difficult conversational skill than social initiation: Social response tends to require awareness of a conversation partner’s speech and thoughts in order to stay on track with the conversation, whereas social initiation mainly requires the availability of a conversation partner. In the current study, children and adolescents improved in social response, a potentially more advanced and complex conversational skill for children and adolescents than social initiation. However, caution must be taken in interpreting this result, as the current study did not track whether participants’ responses to peers were relevant and/or meaningfully extended the conversation.

As the intervention progressed, participants spent more time interacting with a Group of Peers and spent marginally less time interacting with a Leader. These results are consistent with our hypothesis and suggest that participants engaged in more peer interaction and required less social scaffolding from leaders over the course of the intervention. The other interaction variables did not reach significance, potentially due to sample size and power limitations.

Bauminger (2007a) found that children with ASD were more likely to interact in dyads than in small groups, regardless of their participation in a SSTP. Since social demands tend to increase with more social stimuli, small group interaction may be more difficult than dyadic or one-to-one interaction (Bauminger 2007b). In the present study, children and adolescents showed an increase in small group peer interaction, which may be a more complex social skill than dyadic peer interaction.

Although interaction with a Leader decreased over the course of the intervention, it is not clear whether this result was due to changes in the participants’ behavior or the leaders’ behavior. Intervention leaders were instructed to fade out of social interactions as the intervention progressed in order to promote peer interaction. Thus, this result may partially reflect purposeful changes in the leaders’ behavior. Nonetheless, regardless of the leaders’ behavior, participants were able to maintain peer interactions with less involvement and scaffolding from leaders.

Developmental Effects

At the beginning of the intervention, older participants spent more time interacting with a Peer than younger participants. Throughout the intervention, younger participants, particularly those who attended fewer intervention sessions, showed a steeper increase in the amount of time spent interacting with a Group of Peers compared to older participants. Overall, these results suggest that older participants spent more time in dyadic interactions while younger participants spent more time in small group interactions. As peer interactions were embedded within the context of Game Time, these results may simply be representative of game preferences. Older participants may have preferred two-player games (e.g., Mancala) while younger participants may have preferred multiple-player games (e.g., building with TinkerToys). While game preference is the most parsimonious and most likely explanation of these results, an alternative explanation of the results is a true developmental trend in peer interactions. These results may indicate that adolescents with ASD have smaller peer or friendship networks than children with ASD. This conclusion would be consistent with a recent study showing that older adolescents with ASD have fewer friendships than younger adolescents with ASD (Kuo et al. 2011) and would represent a departure from the typical development literature, in which friendship networks expand from childhood to adolescence (Feiring and Lewis 1991; Levitt et al. 1993). Conversely, these results may suggest that adolescents are more likely to engage in dyadic interactions and build intimacy and friendship with specific peers, while children are less likely to form intimate friendships with specific peers. This finding would be consistent with the typical development literature, in which quality and intimacy of friendships tend to increase across adolescence (e.g., Berndt 2004; McNelles and Connolly 1999; Way and Greene 2006). However, this study did not track whether participants repeatedly interacted with the same peers over the course of the intervention; thus, these results cannot be conclusively interpreted at this time.

At the beginning of the intervention, younger participants made more Initiating vocalizations than older participants. Frequency of Initiating vocalizations may be related to number of interaction partners: Younger participants may have spent more time in small group interactions, thus yielding more interaction partners and potentially more opportunities to initiate conversation; older participants may have spent more time in dyadic interactions, thus yielding one interaction partner and potentially fewer opportunities to initiate conversation.

Gender Effects

At the beginning of the intervention, males spent less time interacting with a Group of Peers and spent more time by Self than females. This result is consistent with both the ASD literature and the typical development literature suggesting that girls prefer more interactive activities than boys. Kuo et al. (2011), for example, examined friendship in adolescents with ASD; they found that males tended to engage in passive activities with friends (e.g., watching television, playing video games) while females preferred more interactive activities with friends (e.g., talking, spending time together). In the typical development literature, de Bruyn and Cillessen (2008) reported that female adolescents spent more leisure time engaged in social activities (e.g., shopping, seeing movies, talking to friends) while male adolescents spent more leisure time engaged in sport, car, and computer activities (e.g., basketball, attending car shows, computer gaming). Although gender differences were present at the beginning of this intervention, there were no gender differences in participants’ vocalizations or interactions as a result of the intervention, suggesting that SSTPs may be effective for both males and females with ASD.

Verbal IQ Effects

There were no effects of verbal IQ in the current study. Verbal IQ may have influenced which participants played games together; anecdotally, the adolescent intervention group seemed to form cliques according to cognitive functioning level. However, verbal IQ did not influence participants’ vocalizations and interactions, as operationally defined in the current study. Consistent with this result, LeGoff (2004) also found that improvements in social behavior over the course of a SSTP were not associated with IQ. Overall, these studies suggest that SSTPs have a similar effect on social behavior for higher-functioning and lower-functioning children with ASD. Participants in the current study, however, were required to have a verbal IQ ≥ 65, so it is not clear whether SSTPs are also effective for children with ASD and comorbid intellectual disability.

Intervention Attendance Effects

As participants attended more intervention sessions, they were more likely to engage in dyadic interactions with a Peer and less likely to engage in interactions with a Group of Peers. Greater intervention attendance may have allowed participants to become more familiar with other group members, which may have facilitated the development of friendships with specific peers. However, as noted earlier, data on whether participants repeatedly interacted with the same children over time were not collected, thus limiting the conclusions that can be drawn from the effects of intervention attendance at this time.

Limitations

There are a few limitations in the current study that are worth noting. Although this study demonstrated positive changes in social behavior over the course of a SSTP, the cause of these changes cannot be determined. The Social Adjustment Enhancement Intervention curriculum may have led to these changes in behavior, or, as children became more comfortable with one another throughout the intervention, they may have conversed and interacted with one another more readily. Thus, repeated interactions with peers in a safe, supported environment may have led to these changes in behavior, as opposed to (or in addition to) the social skills curriculum. While it is unclear which explanation is the most accurate (or whether both explanations are accurate), both explanations ultimately result in a positive social outcome for children with ASD and their families.

Game Time was not standardized across intervention groups, across children, or across intervention weeks. Not all intervention groups had the same selection of games available. Only one intervention group included peers with typical development (n = 2). Some children regularly played two-player games while other children regularly played multiple-player games. Also, on a given intervention week, leaders would sometimes pair two (or more) children to play together during Game Time. Although rare, Game Time would occasionally start before behavioral coders were present and prepared to code. This variability in Game Time is reflective of a clinical SSTP in which the goal of the program is to promote social behavior in children and standardization of the program is less important. However, given that behavioral coding occurred across 19 time points, the results of the present study should be robust to minor procedural fluctuations in Game Time.

Behavioral coders were not blind to intervention status, as all behavioral coding occurred within the context of the intervention. In addition, due to time constraints, behavioral coders received limited training and did not have an opportunity to achieve reliability before the intervention. Coders did, however, participate in weekly training sessions throughout the intervention. Behavioral codes were adapted or clarified during the weekly training sessions, as needed. As coders gained more experience and training in the coding system throughout the intervention, they may have become more adept at coding social behaviors; thus, changes in social behavior throughout the intervention may be partially associated with changes in coders’ expertise. However, as there was acceptable to excellent reliability among coders, social behaviors seem to have been relatively intuitive and easy to code without expert knowledge. Since behavioral coding occurred across 19 time points, as noted earlier, the results of the present study should be robust to moderate measurement error.

Low-level social interaction was not differentiated from an absence of social interaction in the current study. Parallel play, in which a child played alongside another child but did not actively engage with that child, was frequently observed during Game Time. Since children were not directly interacting with peers or leaders during parallel play, this behavior was coded as time spent by Self. However, the frequency with which parallel play was observed suggests that it should have been coded separately from time spent by Self.

It is unclear whether participants in the current sample are representative of higher-functioning children and adolescents with ASD. Participants in this sample had to meet two tiers of inclusion/exclusion criteria. First, potential participants met with clinicians to determine appropriateness for the intervention, and second, potential participants met with research staff to determine interest in and eligibility for the research study. As many potential participants were excluded from or declined to participate in the study, the resulting sample may not have been a representative sample. However, the exclusion of children in clinic-based treatment programs for clinical and/or payment purposes is consistent with the reality of clinical treatment. Furthermore, given that participants first enrolled in a clinical intervention and second learned of and enrolled in a research study, this sample may have included some participants who would not typically initiate interest in or self-select into a research study. The sample size for the current study was small (n = 14), which may also limit generalizability of results to other higher-functioning children and adolescents with ASD.

In the present analyses, we did not control for participants’ involvement in additional interventions, as such involvement was assessed via an informal questionnaire with unknown psychometric properties. Also, we did not measure whether changes in social behavior were maintained after the intervention or generalized beyond the intervention setting. In particular, as social behavior was evaluated during Game Time, it is unclear whether changes in social behavior generalized beyond Game Time. While Game Time was not structured by intervention staff, games have rules, which tend to provide an overall framework for social engagement. As such, changes in social behavior may not have generalized to unstructured activities without rules.

Finally, parent-report questionnaires were used to confirm ASD diagnosis in the current study. In the future, it would be helpful to also confirm ASD diagnosis with an observational assessment, such as the Autism Diagnostic Observation Schedule (ADOS; Lord et al. 2002).

Future Directions

In future research, it will be important to include a control group of children and adolescents with ASD that meets regularly, but does not receive the intervention curriculum. By comparing the social behavior of the control group and the intervention group over time, it will be possible to determine whether changes in social behavior are due to the intervention curriculum or due to regular interactions with peers in a safe environment. Relatedly, few research studies have examined the effects of recreational activities on the social behavior of children with ASD. Regular peer interactions through recreational activities (e.g., playing soccer, being on a debate team) may yield positive changes in social behavior.

The present study builds on the research of Bauminger (2002, 2007a) in suggesting that social responses and small group interactions may be key social skills for children and adolescents with ASD. Future research is needed to determine whether these skills are more challenging for children and adolescents with ASD than social initiations and dyadic interactions. Further research is also needed to evaluate whether these skills increase in complexity across development, such that school-aged children and adolescents with ASD have more difficulty mastering these skills than younger children with ASD.

In future studies, it would be informative to record with whom a participant is conversing or interacting. In this way, social exchanges with specific individuals can be tracked over the course of a SSTP, potentially providing insight into the development of friendships in ASD. If certain conversation or interaction patterns tend to facilitate the development of friendship, SSTP curricula could be adapted to encourage these patterns of behavior. In addition, parallel play may be an important developmental precursor to integrated peer interactions (Bakeman and Brownlee 1980) and should be coded in future studies.

Behavioral coding is a promising assessment approach in the SSTP literature, and the current study demonstrates that this approach can be used to regularly assess changes in social skills over the course of a SSTP. To ensure appropriate training and reliability among behavioral coders in future work, coders can receive training and demonstrate reliability before the intervention begins and/or the intervention can be videotaped and later coded by trained and reliable coders. While the disadvantage of behavioral coding is that it is more time-consuming and resource-intensive than other assessment methods, behavioral coding is the only assessment method that can directly evaluate social performance in a natural environment (McMahon et al. in press). Furthermore, insurance companies often impose requirements on funding (Dingfelder and Mandell 2011) and may prefer and/or mandate this assessment method. Given the value of behavioral coding and the preference of some insurance companies for this assessment method, future research should investigate ways to code behavior that are less time-consuming and resource-intensive. Automated computer coding, a behavioral coding method that is currently under development, may prove to be useful in this regard (e.g., Messinger et al. 2009).