Introduction

Children with high functioning autism spectrum disorders (HFASDs; i.e., high functioning autism [HFA], Asperger’s, and PDDNOS) demonstrate a number of core features that significantly affect social performance and serve as the basis for interventions. Central to the impairments is an ongoing significant and pervasive deficit in social interaction skills (American Psychiatric Association [APA] 2000; Church et al. 2000). Numerous studies and authors have described significant social deficits such as rigid and poor play skills (Church et al. 2000), impaired understanding of what constitutes a “friend” (Carrington et al. 2003), an inability to negotiate and compromise (Marks et al. 1999), failure to recognize personal space boundaries (Parsons et al. 2004), and overreliance on inflexible and formal social rules (Klin et al. 2005). Contributing to the social difficulties are impairments in the ability to infer the internal mental states of others (i.e., perspective taking) and accurately identify emotional content in the face and vocal expressions of others (Golan and Baron-Cohen 2006; Howlin et al. 1999; Marans et al. 2005).

Children with HFASDs are also characterized by restricted, repetitive, and stereotyped patterns of interests, activities, or behaviors which can be evidenced in preoccupation with a narrow area of interest and/or parts of objects, rigid adherence to nonfunctional rituals and routines, and stereotyped motor mannerisms (APA 2000). Their circumscribed interests often become intrusively absorbing, consuming the child’s attention and restricting her/his ability to participate in reciprocal interactions (Klin et al. 2000). For example, the children may have difficulty remaining in a conversation on a topic outside their area of interest (Klin et al. 2005). Their need for routines and rituals can also be problematic as unexpected changes to set patterns and schedules can result in stress and acting-out behaviors (Church et al. 2000; Simpson and Myles 1998).

While children with HFASDs generally demonstrate a relative strength in formal language, social language and communication deficits (e.g., atypical inflection, unusual gestures, etc.) have been noted (APA 2000; Church et al. 2000; Volkmar and Klin 2000). Difficulties involving the use of imagery, poor abstraction, impaired interpretation of non-literal language, and an absence of emotionally-laden language have been identified as contributing to their social impairments (Carrington et al. 2003; Howlin et al. 1999; Klin et al. 2005; Little 2002). The broad and pervasive nature of these core deficits requires comprehensive interventions that target both social and communication skills (Klin and Volkmar 2000).

Despite a lack of well-controlled research (Kasari and Rotheram-Fuller 2005), several general treatment guidelines have been proposed and preliminary evidence has supported several techniques that take advantage of the children’s cognitive and language strengths using cognitive and behavioral methodologies. Several authors have asserted the need for explicit skill teaching in which complex social behaviors are deconstructed into discrete components and taught in a part-to-whole manner, beginning with basic and progressing to more complex skills (Howlin et al. 1999; Klin and Volkmar 2000). Techniques such as teaching, modeling, role-playing, and performance feedback are also common and have been associated with positive social outcomes (e.g., Barnhill et al. 2002; Cragar and Horvath 2003; Lopata et al. 2006; Marriage et al. 1995). Highly structured and predictable environments in which instruction can be delivered and performance feedback promptly provided have also been identified as beneficial (Klin and Volkmar 2000; Simpson and Myles 1998). Providing immediate performance feedback and reinforcement promotes more rapid skill acquisition and maintenance, as well as provides the opportunity to correct social and behavioral errors (Attwood 2000; Howlin et al. 1999; Myles and Simpson 2001).

Group formats are also commonly used for social skills intervention with these children as they reportedly afford some advantages. According to Solomon et al. (2004), groups provide opportunities for peer interaction and practice in a more naturalistic setting. In addition, children have the opportunity to develop social relationships that may continue outside the group sessions (Carter et al. 2004). When considering the use of a group, an additional question involves whether the group should be composed of all children with HFASDs or a combined group of typical children and children with HFASDs. One of the positive aspects of a homogeneous group for children with HFASDs is that it allows them to work on their unique skill deficits in a safe setting with peers who share their problems and experiences (Marriage et al. 1995; Mishna and Muskat 1998). Although homogeneous groups may be appropriate for intensive skills instruction, opportunities for social integration and practice are also necessary (Klin and Volkmar 2000).

While a number of intervention techniques have been described, research on social interventions for these children is lacking. The existing research has been characterized by a lack of randomized group-based treatment studies and few manualized programs (White et al. 2007). Two studies that reported social improvements for children with HFASDs using randomized designs and group formats were conducted by Solomon et al. (2004) and Lopata et al. (2006). Solomon et al. (2004) examined the effectiveness of a social enhancement program for 18 children with HFASDs (9 who received treatment and 9 matched wait-list control children). Children participated in small groups (3:5 staff–child ratio) 90 min per week for a total of 20 weeks. The intervention targeted theoretically-derived skills in face and emotion recognition, group and individual problem-solving, and perspective taking. Instruction, practice, and reinforcement were provided using a visual template, role playing, and games. Parents also participated in an educationally-based group. Using standardized measures, the program was found to be effective in increasing emotion recognition and problem solving; however, both groups improved in their perspective-taking skills. While noting limitations, the authors suggested that group social treatments have potentially positive effects for these children.

Lopata et al. (2006) examined the effectiveness of an intensive manualized summer social development program for 21 six to 13-year olds with AD. The full-day six-week program used small groups (approximately 3:6 staff–child ratio) and targeted four core deficit areas including social skills, face-emotion recognition, range of interests, and interpretation of non-literal language. Instructional methods involved teaching, modeling, role-playing, and performance feedback during social skills groups, as well as therapeutic activities developed to prompt and reinforce skills taught during the groups. While all participants received the same manualized treatment and curriculum and high rates of explicit performance feedback throughout the day, half were randomly assigned to receive performance feedback using a response-cost point system and the others received feedback with no preset behavioral categories or contingencies. Parent training was also provided for 90 min one time per week. Treatment fidelity was assessed during staff training and throughout the program using a standardized fidelity sheet delineating specific protocol requirements. All staff members were required to demonstrate a minimum of 90% fidelity during training. During program implementation, treatment fidelity over the two summer programs was 87%. Pre-post parent and staff ratings indicated significant improvements in social skills, as well as significant improvement in adaptability and a significant decrease in atypicality on parent reports for the overall program. Time by performance feedback condition interactions were not significant, suggesting that neither format was associated with a significantly better outcome. The authors cautioned that the findings required replication and the small sample may have hindered detection of differences based on performance feedback type.

While such studies provide preliminary support for the effectiveness of cognitive and behavioral approaches for social enhancement of children with HFASDs, treatment research is clearly lacking. Following a review of current research trends in HFASDs, Kasari and Rotheram-Fuller (2005) claimed that “[t]reatment and outcome studies remain remarkably sparse but are critically needed for this population” (p. 500). Intervention programs are needed that specifically address the core social and communicative areas known to characterize children with HFASDs, as well as individualized skill targets (Klin and Volkmar 2000; Tsatsanis et al. 2004). An NIMH-supported working group recently developed a model for validating psychosocial interventions for ASD. Smith et al. (2007) proposed a four-phase model to serve as a “road map” for conducting psychosocial intervention research. The four phases progress from development and systematic testing of new techniques, to development and pilot testing of a manualized protocol including development of fidelity measures, to randomized clinical trials, and finally to community-based effectiveness studies. Recently, White et al. (2007) reviewed 14 group-based social skills interventions and found that the majority did not use a treatment manual, which they identified as necessary for replication studies and randomized clinical trials. Further, they noted that none of the reviewed studies used random assignment. The authors concluded that there is a need for development of manualized curricula and intervention procedures that allow for assessment of treatment fidelity, as well as replication. They also noted the potential benefits of multiple informants to strengthen outcome measurement.

This study presents results of years three and four of a four-year study examining the effectiveness of a manualized intensive summer social development program on the social performance of 6–13 year olds with HFASDs. Years three and four included a larger sample that allowed for an expanded number of social performance indicators beyond those previously reported for years one and two. The three goals of this study were to: (1) replicate findings from years one and two using similar and additional measures; (2) compare the outcomes of participants who received a response-cost point system to those who received non-categorical feedback; and (3) provide further evidence of program feasibility (i.e., treatment integrity, parent satisfaction, and evidence of efficacy).

Method

Participants

At present, there is ongoing debate regarding the extent to which the three disorders that comprise HFASDs (HFA, AD, and PDDNOS) can be reliably distinguished (Klin et al. 2005; Miller and Ozonoff 2000; Ozonoff and Griffith 2000). Relative strengths in cognitive ability and language distinguish these children from those on the autism spectrum with more significant language and cognitive impairments (Klin and Volkmar 2000); however, their strengths also complicate differential diagnosis (Kasari and Rotheram-Fuller 2005; Kim et al. 2000). To date, many researchers have included children with AD, HFA, and PDDNOS in their intervention studies (e.g., Barnhill et al. 2002; LeGoff 2004; Solomon et al. 2004) due to their shared social and communicative characteristics.

Recognizing the ongoing problems with differential diagnosis, the current study included children with AD, HFA, and PDDNOS. A total of 54 children ages 6–13 diagnosed with AD, HFA, or PDDNOS participated in this study. The children were recruited over a two-year period from local school districts, clinics, and parent support groups in the Western New York area using flyers and public notices. All children met specific inclusion criteria using a multiple-gate screening procedure. This procedure was established based on numerous studies that have used formal written diagnosis of a HFASD by a licensed mental health professional and records review in their determination of eligibility (e.g., Abell and Hare 2005; Barnhill et al. 2002; LeGoff 2004; Martin et al. 1999). Inclusion criteria included a formal written diagnosis of AD, Autism, or PDDNOS by a licensed psychologist or psychiatrist, WISC-IV short-form IQ composite > 70, an index score ≥ 80 on at least one factor of the WISC-IV, and the absence of a current significant language delay. The first gate required submission of a written diagnosis of a HFASD, and all relevant psychological and psychiatric reports and special education records. Upon receipt, the case was moved to the second gate where two members of the senior research team independently reviewed the written reports using a standardized checklist encompassing DSM-IV-TR criteria (i.e., social interaction impairments, and restricted repetitive and stereotyped patterns of behaviors or interests; APA 2000), cognitive ability, and current language levels. Each reviewer independently made a determination as to whether the data in each child’s record supported the presence of a HFASD. Agreement between the senior researchers was required before moving the case to the third gate. 
A total of 94 records were reviewed in the second gate and 56 were determined by both senior researchers to have information consistent with a HFASD (this does not include children who repeated the program in the second year of the study). Failure to move a child to the third gate was almost exclusively due to indication of a cognitive deficit or current language delay in the psychological, psychiatric, and/or special education reports. In the third gate, children participated in an assessment involving cognitive testing (i.e., WISC-IV short form) and informal observation of their social behaviors. Following the assessment, two senior research team members again reviewed the reports and current results of the cognitive testing and informal observations using the standardized checklist and determined whether results were again consistent with a HFASD. Agreement was required for inclusion in the study. Of the 56 children tested in the third gate, only two were rejected and this was due to significant problems with physical aggression.

This multiple-gate screening resulted in 72 children qualifying for the program over the two summers (36 per year). A total of 18 children participated in both years of the current study. In such cases, only data from their first year was included in the analyses resulting in 54 children in the current study. Characteristics of the sample are described in Table 1. The participants were overwhelmingly male (92.6%) and Caucasian (88.9%), with a mean parent education level of 15.58 (SD = 2.30) years. The diagnostic breakdown was 66.7% with AD, 22.2% with PDDNOS, and 11.1% with HFA. Once accepted into the study, participants were matched on age, diagnosis, and gender and randomly assigned to treatment conditions (i.e., performance feedback conditions). This process resulted in a total of 25 children assigned to the response-cost point condition and 29 children to the non-categorical feedback condition. All of the children who began the study completed the summer treatment program except for one who discontinued due to significant deterioration associated with a psychotic episode.
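As an illustration only, matched random assignment of this kind can be sketched in Python. The study does not report the exact algorithm, so the stratification by diagnosis and gender, the pairing of age-adjacent children within each stratum, and all names below (`matched_random_assignment`, the fields of the `children` records) are assumptions rather than the authors' procedure.

```python
import random
from collections import defaultdict


def matched_random_assignment(children, seed=None):
    """Assign children to "RC" or "NC" after matching (hypothetical scheme).

    children: list of dicts with keys "id", "age", "diagnosis", "gender".
    Children are grouped by (diagnosis, gender), sorted by age, and paired
    with their nearest-age neighbor; one member of each pair is randomly
    assigned to each condition. An unpaired child is assigned at random.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for child in children:
        strata[(child["diagnosis"], child["gender"])].append(child)

    assignment = {}
    for key in sorted(strata):
        members = sorted(strata[key], key=lambda c: (c["age"], c["id"]))
        # Walk through age-adjacent pairs within the stratum.
        for i in range(0, len(members) - 1, 2):
            pair = [members[i], members[i + 1]]
            rng.shuffle(pair)
            assignment[pair[0]["id"]] = "RC"
            assignment[pair[1]["id"]] = "NC"
        # A stratum with an odd count leaves one child without a match.
        if len(members) % 2 == 1:
            assignment[members[-1]["id"]] = rng.choice(["RC", "NC"])
    return assignment


# Hypothetical records, not data from the study.
example_children = [
    {"id": 1, "age": 7, "diagnosis": "AD", "gender": "M"},
    {"id": 2, "age": 8, "diagnosis": "AD", "gender": "M"},
    {"id": 3, "age": 10, "diagnosis": "AD", "gender": "M"},
    {"id": 4, "age": 11, "diagnosis": "AD", "gender": "M"},
    {"id": 5, "age": 9, "diagnosis": "HFA", "gender": "M"},
    {"id": 6, "age": 12, "diagnosis": "PDDNOS", "gender": "F"},
]
example_assignment = matched_random_assignment(example_children, seed=0)
```

A scheme of this kind need not produce equal group sizes, since unpaired children are assigned by a single random draw.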

Table 1 Demographic characteristics of participants

Measures

WISC-IV Short Form

A four subtest short form of the Wechsler Intelligence Scale for Children-4th Edition (WISC-IV; Wechsler 2003) was utilized to obtain an estimate of general intelligence as part of the screening process. The four subtests administered were Vocabulary, Similarities, Block Design, and Matrix Reasoning. The composite score derived from this short form has an internal consistency reliability of .95 and correlates .92 with the Full Scale IQ of the complete test. The methods described by Tellegen and Briggs (1967) were utilized to calculate the composite reliability, correlation with the full test, and deviation quotient formula of the short form based on standardization information available in the test manual.
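The Tellegen and Briggs (1967) calculations can be illustrated with a short Python sketch. The formulas below are the standard ones for an equally weighted composite of subtests scaled to a mean of 10 and a standard deviation of 3. The subtest intercorrelations and reliabilities in the example are hypothetical placeholders, not values from the WISC-IV manual, and the function names are introduced here for illustration.

```python
from math import sqrt


def deviation_quotient(scaled_scores, intercorrelations,
                       subtest_mean=10.0, subtest_sd=3.0):
    """Convert a sum of subtest scaled scores to a deviation quotient.

    intercorrelations: the correlation for every distinct pair of subtests
    (6 values for a four-subtest short form).
    """
    n = len(scaled_scores)
    # SD of the sum of n correlated subtests with equal SDs.
    composite_sd = subtest_sd * sqrt(n + 2 * sum(intercorrelations))
    composite_mean = n * subtest_mean
    return 100 + 15 * (sum(scaled_scores) - composite_mean) / composite_sd


def composite_reliability(reliabilities, intercorrelations):
    """Reliability of an equally weighted composite of standardized subtests."""
    n = len(reliabilities)
    twice_r = 2 * sum(intercorrelations)
    return (sum(reliabilities) + twice_r) / (n + twice_r)


# Hypothetical inputs: Vocabulary, Similarities, Block Design, Matrix Reasoning.
example_scores = [13, 13, 13, 13]
example_intercorrelations = [0.5] * 6   # placeholder values
example_reliabilities = [0.85] * 4      # placeholder values

example_dq = deviation_quotient(example_scores, example_intercorrelations)
example_rel = composite_reliability(example_reliabilities,
                                    example_intercorrelations)
```

With these placeholder inputs the composite is more reliable than any single subtest, which is the general reason a short form built from several subtests can approach the reliability of the full scale.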

Several measures were used to assess the effectiveness of the program. These measures and subscales were selected as they assess skills targeted by the program, and represent behaviors that affect social performance. Further, the majority of the measures allowed for ratings by multiple informants, an approach considered potentially useful in obtaining more accurate outcome assessments (White et al. 2007). The following describes the measures.

Behavior Assessment System for Children Parent Rating Scales (BASC-PRS) and Teacher Rating Scales (BASC-TRS; Reynolds and Kamphaus 1992, 1998)

The BASC is a rating scale that quantifies parent and teacher perceptions of children’s behavior and skills using items rated on a four-point frequency scale ranging from 0 = Never to 3 = Almost Always. For this study, three subscales (Social Skills, Withdrawal, and Atypicality) and two composites (Behavior Symptoms Index [BSI] and Adaptive Skills) were selected to replicate and expand results of the previous study. The Social Skills scale assesses interpersonal skills needed for successful social adaptation and interaction, Withdrawal assesses the tendency to pull back from or avoid social contact with others, and Atypicality assesses behaviors generally considered developmentally immature or odd (e.g., humming to self, rocking). The BSI is a broad clinical composite that subsumes scores from the hyperactivity, aggression, anxiety, depression, atypicality, and attention problems scales. The Adaptive Skills composite is an index of general adaptive functioning that subsumes scores from the adaptability, leadership, and social skills scales. Coefficient alphas for the scales and composites used in this study ranged from .73 to .93 for the PRS and .74 to .97 for the TRS. Concurrent validity studies with instruments that measure similar behaviors and skills (e.g., Conners Rating Scales, Child Behavior Checklist) have generally yielded moderate correlations with the BASC (Reynolds and Kamphaus 1992, 1998). While the second edition of the BASC has been published, the original was administered in this study in an attempt to replicate prior findings using the same version of the instrument. Similar to the prior study, treatment staff completed the TRS and parents completed the PRS.

Skillstreaming Survey (Ss)

The Ss used in this study is a 38-item survey adapted by the current researchers to measure social skills and social behaviors. The majority of items included in the adapted Ss were developed by the Skillstreaming authors (Goldstein et al. 1997; McGinnis and Goldstein 1997) and are provided as part of the Skillstreaming program as a measure of skills taught in the curriculum. The adapted items were selected from the Skillstreaming curriculum as they assessed skills covered in the program and social behaviors that were reinforced. A total of 38 items were included for the parent survey and 38 items for the staff survey. The skills measured on both forms are identical; however, the language was modified to reflect “your child” for the parent rating scale, and “the child” for the staff rating scale. Each item describes a social behavior and raters select the extent to which the child demonstrates or engages in the specific behavior/skill on a 5-point scale (1 = Almost Never; 2 = Seldom; 3 = Sometimes; 4 = Often; 5 = Almost Always). Based on the sample from the current study, the Ss total score yielded a coefficient alpha of .94. This composite score also correlated .72 with the BASC Social Skills score, .62 with the BASC Leadership score, and −.45 with the BASC Withdrawal score.
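For readers unfamiliar with coefficient alpha, the following Python sketch shows how such an internal consistency estimate is typically computed from item-level ratings. It uses the usual formula based on item variances and the variance of the total score; the function name and the small ratings matrix are hypothetical and are not the study's data.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Coefficient alpha for a respondents-by-items matrix of ratings.

    item_scores: list of respondents, each a list of k item ratings.
    """
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("alpha requires at least two items")
    # Transpose so that each tuple holds all ratings for one item.
    items = list(zip(*item_scores))
    sum_item_variances = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum_item_variances / total_variance)


# Hypothetical ratings on the 1-5 scale: 3 raters by 2 items.
example_ratings = [[1, 2], [2, 1], [3, 3]]
example_alpha = cronbach_alpha(example_ratings)
```

Alpha rises as items covary more strongly relative to their individual variances, so a value of .94 on a 38-item scale indicates that the items rank children very consistently.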

Diagnostic Analysis of Nonverbal Accuracy2 (DANVA2; Nowicki 1997)

The DANVA2 is a computer-based research instrument that assesses the ability to accurately identify four basic emotions (i.e., happy, sad, angry, and fearful) through facial expressions or spoken language cues. While it includes four subtests (Adult Faces 2, Child Faces 2, Adult Paralanguage 2, and Child Paralanguage 2), only Adult Faces 2 and Child Faces 2 were used in this study. In the Adult Faces 2 and Child Faces 2 subtests the examinee watches facial pictures conveying various emotions on the computer screen. Each picture is presented for two seconds and then the examinee selects whether the person in the picture appeared happy, sad, angry, or fearful. A median coefficient alpha of .73 (range .64 to .90) was reported for Adult Faces 2 across a wide age range (2.8 years to college age). Significant correlations (moderate to high) were reported between the Adult Faces 2 subtest of the DANVA2 and the original DANVA and the Caucasian Facial Expressions of Emotion Test. A modal alpha coefficient of .76 (range .69 to .81) was reported for the Child Faces 2 subtest across the ages of 4 to 16 years. The Child Faces 2 subtest scores correlated moderately and significantly with the same subtest on the original DANVA.

Parent Satisfaction Survey (PSS)

Parent satisfaction was assessed using a researcher-developed 9-item survey. Items 1–6 were based on a five-point Likert scale ranging from 0 = Completely Dissatisfied to 4 = Completely Satisfied. These items assessed parent satisfaction with (1) staff understanding of their child, (2) services received, (3) effective teaching for their child, (4) cooperation of program staff, (5) child progress, and (6) staff communication regarding their child’s progress. Parents were also asked to rate their (7) overall feeling about the program, (8) whether they would recommend the program to another, and (9) their satisfaction with parent training on a five-point Likert scale.

Procedure

Following screening, participants were matched and randomly assigned to one of two performance feedback treatment conditions (see below). Each summer the program was administered for six weeks, five days per week, for six hours each day on a college campus. Classrooms and group rooms on the campus, as well as outdoor space, were used for conducting the groups and therapeutic activities. In both conditions the social treatments were delivered in small groups consisting of six children and three staff, and each group had its own classroom or group room. The staff was composed of undergraduate and graduate students from the fields of psychology and education. While children from both performance feedback conditions attended the treatment program concurrently, the children and staff in the two conditions were kept segregated and did not interact with children or staff from the other condition. Because staff was trained separately and the groups were segregated, the staff from each of the treatment conditions was only familiar with their own performance feedback condition and the six children in their assigned group. A total of four treatment cycles were conducted each day, with each cycle beginning with a 20-min structured social skills group and ending with a 50-min therapeutic activity created to practice skills taught in the 20-min groups. The content and activities of the program were identical across the treatment conditions and targeted core deficit areas identified in the diagnostic criteria and literature including social skills, face and emotion recognition, interpretation of non-literal language, and interest expansion.

The curricular content of the social skills program conducted during the 20-min groups was derived from Skillstreaming (Goldstein et al. 1997; McGinnis and Goldstein 1997). Skillstreaming is a program designed to teach social skills to children and adolescents using systematic procedures including teaching, modeling, role-playing, performance feedback and transfer of learning (McGinnis and Goldstein 1997). Each instructional session followed the designated nine-step Skillstreaming procedure including: (1) Define the skill; (2) Model the skill; (3) Establish trainee skill need; (4) Select role-player; (5) Set up the role play; (6) Conduct the role play; (7) Provide performance feedback; (8) Assign skill homework; [and] (9) Select next role-player (Goldstein et al. 1997, p. 37). Skills from the Skillstreaming curriculum were selected if they addressed specific characteristics delineated in the DSM-IV-TR (APA 2000) and related literature, and were taught in a sequence from basic to more advanced. While the majority of the same Skillstreaming skills were taught to all participants, a few of the skills differed based on the age of the children (see Appendix). These skills differences were consistent across both treatment conditions and allowed for the inclusion of some skills that were more reflective of social situations/demands likely to be encountered by children at differing ages.

Instruction in face and emotion recognition and interpretation of non-literal language was also provided during two separate 20-min groups each week. The researcher-developed curriculum in face and emotion recognition provided instruction and practice in recognizing and labeling elements of facial expression that reflected different emotions, the physiological elements associated with the emotions, and recognition of these physiological elements in one’s own body. For interpretation of non-literal language, the researcher-developed curriculum provided instruction and practice in understanding the multiple meanings of language that can exist beyond literal and concrete interpretations. Participants also worked on interpreting idioms.

After each 20-min group, participants then practiced skills during the 50-min therapeutic cooperative activities targeting social skills, face and emotion recognition, interpretation of non-literal language, and interest expansion. Cooperative activities were purposefully designed to require a minimum of two participants to work together to successfully complete the task. An example of a cooperative activity involved a construction task in which two children were required to build an object from craft sticks and glue. During the activity, each child was only allowed to use one hand. Each pair had to agree on an object to build, develop a plan, and build the object together. At the end of the activity the children then presented their object and discussed the social skills they used to successfully complete the task. Therapeutic activities were also designed to improve face and emotion recognition skills by targeting identification of facial expressions and emotions, physiological responses associated with facial expressions, and the link between expression, emotion, and behavior. These activities began with basic understanding and became increasingly challenging over the course of the program. An example of an early task involved a face and emotion “scavenger hunt” in which the children had to locate facial expressions in magazines that represented each emotion from a list of emotions and create a collage. The children then presented their collages and described the facial features and emotions depicted in each of the identified magazine pictures. An example of a more challenging and naturalistic activity that occurred near the end of the program involved the children watching a movie with human actors. The staff randomly paused the movie and had the children describe the facial expression being portrayed by the actor, the emotion associated with that facial expression, and how that emotion would be experienced physiologically. 
Therapeutic activities that addressed interest expansion worked to extend the children’s interest in and awareness of more diverse topics using activities that required exploration of non-self-selected topics. For example, one activity required two children working cooperatively to randomly select a topic from a hat containing a number of potential topics (the staff ensured that none of the potential topics reflected the children’s restricted area of interest). The two children then collaboratively conducted a computer search on the topic and developed a single book describing the topic. The pairs then stood before the group and together presented their book, what they had learned, and when they might have the opportunity to use this new information in a social situation.

While participants in both treatment conditions received the same core treatment components, two forms of performance feedback were compared. Children randomly assigned to the response-cost (RC) condition received performance feedback based on specific operationally defined social skills and behaviors. Point provision and costing were done immediately following the occurrence of the target behavior or social skill such that participants received a preset number of points for demonstrating a predetermined social skill or behavior, as well as for using any prior social skill taught during the program. Rule violations or demonstration of problematic social behaviors (e.g., poor eye contact, sharing irrelevant information) resulted in withdrawal of a preset number of points. Each child in the RC group also had a daily report card (DRC) with three or four target social behaviors unique to that child. The social behaviors selected for the DRC were not chosen from the Skillstreaming curriculum, but rather were based on the unique social and behavioral needs of each child that were not covered in the curriculum. Formal point reviews were conducted every 20 min throughout the day. At the end of each day, each child had the opportunity to earn an edible reinforcer for reaching an individualized goal. Additionally, each child’s points were used to earn a contingent weekly field trip. In contrast, children in the non-categorical (NC) feedback condition received feedback without the use of predetermined operationally defined behavioral categories. Similar to the RC condition, each child in the NC feedback condition had three or four individual skills and behavioral targets beyond those covered in the Skillstreaming curriculum that were informally reviewed at the end of each day. Daily edible snacks and the weekly field trip were non-contingent for children in the NC feedback group.

Pretests were completed by parents just prior to the program and staff provided ratings on the eighth day of the program. Child pretesting was done during the first week of the program. Parent and staff posttest ratings were done during the last two days of the program and children were tested during the last three treatment days. Parent satisfaction surveys were only administered during the first year of the current study due to the large number of rating forms used during the second year. Parent fatigue with the large number of rating forms was a concern and it was decided that other measures of outcome were more necessary.

Treatment Integrity

In order to monitor and ensure treatment integrity, three fidelity procedures were instituted (two during staff training and one throughout the program). First, all program staff was required to pass a written exam assessing their knowledge of the treatment manual including the clinical features of HFASDs, instructional methods, and program procedures with a score of 100%. Second, staff members then received two days of classroom training followed by three days of applied practice. Applied practice days had staff members practicing administering the program with other staff members serving as child actors. During these sessions, research assistants and/or program directors assessed fidelity using a standardized tracking sheet indicating the correct instructional sequence and procedures, scheduling and time requirements, and feedback formats. Each staff member was required to achieve a minimum of 90% fidelity during the practice sessions. Third, these same tracking sheets were used by research assistants to monitor treatment fidelity throughout program implementation for the social skills groups and therapeutic activities. Fidelity was assessed for approximately 15% of all sessions and sessions were selected randomly. Combined treatment fidelity across the two groups for both years of this study was 95.75% for social skills groups and 96% for activities. Comparison of fidelity by treatment condition indicated 96.5% for both social skills and for activities for the NC feedback condition, and 95% for social skills and 95.5% for activities for the RC condition. These data reflect high levels of treatment integrity overall, as well as high levels of comparability across treatment conditions. Evidence of treatment integrity has been identified as crucial in establishing the validity of treatments (White et al. 2007).

Results

Data Analyses

Data for the current study were examined using several statistical procedures. Initially, the groups were compared to determine whether they differed on important demographic characteristics using independent samples t-tests. Outcome measures were then analyzed to determine treatment effects using repeated-measures analysis of variance. Effect size estimates were also calculated using Cohen’s d (Cohen 1988) for mean differences and omega squared (ω²; Hays 1994) for the bias-corrected proportion of the total variance in the dependent variable related to main and interaction effects.
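The two effect size estimates named above can be sketched as follows. This is a minimal illustration, not the authors' computation: it assumes d is the pretest mean minus the posttest mean divided by the root mean square of the two standard deviations, and it uses the textbook between-subjects form of omega squared, which may differ from the variant applied to the repeated-measures design. All input values are hypothetical and are not study data.

```python
import math


def cohens_d(mean_pre, sd_pre, mean_post, sd_post):
    """Standardized mean difference (pre minus post) over the RMS of the two SDs (assumed form)."""
    pooled_sd = math.sqrt((sd_pre ** 2 + sd_post ** 2) / 2)
    return (mean_pre - mean_post) / pooled_sd


def omega_squared(ss_effect, df_effect, ms_error, ss_total):
    """Textbook between-subjects omega squared: bias-corrected share of total variance."""
    return (ss_effect - df_effect * ms_error) / (ss_total + ms_error)


# Hypothetical values for illustration only.
d = cohens_d(mean_pre=50.0, sd_pre=10.0, mean_post=54.0, sd_post=10.0)
w2 = omega_squared(ss_effect=40.0, df_effect=1, ms_error=10.0, ss_total=540.0)
print(round(d, 2), round(w2, 3))
```

With this sign convention, a negative d indicates a higher posttest mean than pretest mean.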

Demographic Comparisons

To examine treatment condition comparability on major demographic variables expected to result from random assignment, the RC and NC groups were compared on child age, child IQ, and parent education level. Results of independent samples t-tests (alpha = .05, two-tailed) indicated no significant differences between the two treatment conditions on average child age, t(52) = 0.31, p = .760, average child IQ, t(52) = 0.84, p = .406, or average parent education level, t(50) = 0.75, p = .454. Results of these comparisons reflected a high level of comparability across the treatment conditions. This comparability made it unnecessary to utilize age, IQ or parent education as covariates in any of the outcome analyses that follow.
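For readers less familiar with the comparison used here, the sketch below computes a pooled-variance independent samples t statistic and its degrees of freedom (n1 + n2 − 2). The pooled-variance form is an assumption, since the text does not say which variant was used, and the two small groups are hypothetical values rather than study data. Computing the p value is omitted because it requires a t distribution function outside the Python standard library.

```python
import math
from statistics import mean, variance


def independent_t(group_a, group_b):
    """Pooled-variance (Student's) t statistic and degrees of freedom for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pool the two sample variances, weighting each by its own degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Hypothetical child ages (years) for two small groups; illustration only.
t_stat, df = independent_t([9, 10, 11], [7, 8, 9])
print(round(t_stat, 3), df)
```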

Dependent Measures

BASC Scales and Composites

Statistical results for the major BASC scores used in this study are reported in Table 2. Three specific BASC scales (i.e., Social Skills, Atypicality, and Withdrawal) were examined because their content reflects both important aspects of HFASDs and program intervention targets. For the Social Skills scale, significant pre-post main effects were found for both parent ratings, F(1, 48) = 9.85, p = .002, and staff ratings, F(1, 51) = 6.22, p = .008; however, no interactions were significant. Effect size estimates suggested a small (staff d = −0.24) to medium (parent d = −0.42) treatment effect for social skills. On the Atypicality scale, no significant interaction or main effect was found for parent ratings. However, a significant time by treatment condition interaction was found for staff ratings of Atypicality, F(1, 51) = 4.97, p = .03. Examination of pre-post ratings indicated that the NC group received higher ratings of Atypicality at posttest than pretest (medium effect, d = −0.32), whereas the RC condition received slightly, though negligibly, lower ratings pre-post for Atypicality (d = 0.11). For parent rated Withdrawal, a significant pre-post main effect was found, F(1, 48) = 7.89, p = .004, reflecting a significant decrease (d = .28) in withdrawn behaviors. A significant time by treatment condition interaction was found for staff rated Withdrawal, F(1, 51) = 5.11, p = .028. Examination of the effect sizes for the two treatment conditions indicated no difference for the NC group (d = −.04) and a medium effect size for the RC group (d = .41), reflecting a decrease in withdrawn behaviors.

Table 2 BASC PRS and TRS pretest and posttest means (and standard deviations), tests of significance and effect size estimates

The BASC Adaptive Skills Composite and Behavioral Symptoms Index (BSI) were also examined as broad, general measures of adaptive functioning and clinically relevant maladaptive behaviors. On the Adaptive Skills Composite, significant main effects were found for both parent ratings, F(1, 48) = 5.72, p = .011, d = −.39, and staff ratings, F(1, 51) = 7.78, p = .004, d = −.29. Effect size estimates reflected a small to medium magnitude increase in general adaptive skills. No significant time by treatment condition interactions were present for either the parent or staff ratings of general adaptive skills. For the BSI, parent ratings indicated a significant pre-post main effect, F(1, 48) = 11.33, p = .001, indicative of a decrease in general problematic behaviors. The effect size d = .32 reflected a small to medium effect for the overall treatment on the parent rated BSI. No significant time by treatment condition interaction was found for parent BSI ratings. In contrast, a significant time by treatment condition interaction was found on the staff rated BSI, F(1, 51) = 9.45, p = .003. Effect size estimates indicated a small to medium effect (d = −.37) for the NC condition indicative of a perceived increase in negative behavioral symptoms over the course of the program. However, there was a small to negligible effect size (d = .14) for the RC condition indicating no pre-post difference or a slight decrease in behavioral symptoms.

Skillstreaming Survey (Ss)

Results of the pre-post Ss indicated significant increases and medium effect size estimates for social skills for both parent ratings, F(1, 43) = 20.73, p < .001, d = .54, and staff ratings, F(1, 52) = 19.77, p < .001, d = .51. No significant time by treatment condition interactions were found. See Table 3 for more detailed Ss results.

Table 3 Skillstreaming survey (Ss) pretest and posttest means (and standard deviations), tests of significance and effect size estimates

DANVA2

Because the DANVA2 was added in the second year of this study, there are fewer DANVA2 outcome scores (n = 36) compared with the other instruments. Results of the DANVA2 are reported in Table 4. These results indicated no significant main effect for either the Child Faces 2 subtest, F(1, 34) = 2.26, p = .07, d = −.19, or the Adult Faces 2 subtest, F(1, 34) = .97, p = .166, d = −.15. Additionally, no time by treatment condition interactions achieved statistical significance for the DANVA2.

Table 4 DANVA2 pretest and posttest means (and standard deviations), tests of significance and effect size estimates

Parent Satisfaction Survey (PSS)

As previously noted, the PSS was only administered during the first year of this study. Of the 36 children that participated in year one, a total of 34 parents completed and returned the PSS. On four of the nine items, 100% of the parents reported being “completely satisfied” or “satisfied” with staff understanding of their child, services received, effective teaching for their child, and cooperation of program staff with parents. On two items, over 90% reported being “completely satisfied” or “satisfied” with child progress and staff communication regarding their child’s progress. For the item assessing “overall feeling” about the program, 82% reported feeling “very positive” and 18% reported feeling “positive”. Regarding whether parents would recommend the program to another, 82% indicated they would “strongly recommend” and 18% would “recommend” the program to another. Finally, approximately 82% rated the parent training sessions as “very helpful” or “helpful”.

Discussion

As previously noted, the purpose of this study was to replicate and expand results from the first two years of this four-year study examining the effectiveness of a manualized summer social program on the social performance of children with HFASDs. The study also assessed feasibility by considering parent satisfaction and treatment integrity, along with general child outcomes. The most consistent finding of the current study was the significant social improvements reported by both parents and staff. The significant improvements in social skills reported on the BASC replicate findings from the earlier study (Lopata et al. 2006) and were further supported by results of the added Ss measure, which also indicated significant improvement based on parent and staff ratings. While significant social improvements were found for the overall program, there was no significant interaction to support the relative superiority of one feedback format over the other (RC or NC) based on parent or staff ratings. This is consistent with earlier findings. The lack of a clearly superior method of performance feedback suggested that the high rates of explicit performance feedback and reinforcement in both groups were similarly effective in promoting social skills as measured by the BASC and Ss.

On the BASC subscale assessing parent rated odd and developmentally immature behavior (i.e., Atypicality), there was no statistically significant change pre-post. However, a small effect size (d = .21) was noted for the main effect, reflecting a possible decrease that went undetected, perhaps due to the greater variation in scores seen on this measure and the resulting lower statistical power. This nonsignificant finding differs from the previous study, in which parents reported a significant decrease in their children’s atypical behavior. The small effect size indicating a possible decrease in atypical behaviors in this study is, however, consistent with that of the previous study. In the prior study, staff reported a significant main effect increase in atypical behavior and no significant difference between the two forms of performance feedback. In the current study, staff reported an increase in atypical behavior for the NC group (small to medium effect) and a relatively consistent level for the RC group. It was suggested in the previous study that staff rated atypicality may rise over the course of the program because the staff are initially less familiar with the children and are exposed to a greater array of each child’s unusual behaviors by the end of the program. It was also suggested that the children may be more inhibited initially at the program, but show more atypical behaviors as they become more comfortable being in the program setting (Lopata et al. 2006). The current finding of an interaction suggests that these can only be considered partial explanations. In the current study, the RC condition was rated at similar levels of atypicality at both pretest and posttest, while the NC group was rated more atypical at posttest.
An examination of the means for the two conditions in the previous study indicated that both groups showed more atypicality at the posttest, but that the difference was two points for the RC condition and four points for the NC condition. Taken together, both studies suggest that the RC condition had a greater positive impact on atypicality ratings of the children.

New outcome measures were added beyond those used in the previous study including the Adaptive Skills Composite, Withdrawal scale, and BSI of the BASC, and the DANVA2. Both parent and staff ratings reflected a significant increase in adaptive skills for the program overall, with neither RC nor NC feedback producing a significantly higher score. On the measure of withdrawal, parents reported a significant overall decrease pre-post and neither feedback form was shown to be significantly more effective in reducing withdrawn behavior. For the staff ratings, a significant interaction was found indicating a meaningful decrease in withdrawn behavior for children in the RC group and no substantial change for the NC group. Similar findings were reflected on the BSI which encompasses a number of externalizing and internalizing behaviors. Parents reported a significant decrease in the behavioral symptoms assessed by the scale, with neither form of performance feedback appearing to be more effective. In contrast, staff ratings indicated an increase in behavioral symptoms for the NC group (small to medium effect), whereas children in the RC group had a slight decrease.

On the direct measure of face-emotion recognition (DANVA2), there was no significant change in the children’s ability to identify emotions in adult or child faces. This finding should be considered in light of the fact that the group’s average pretest score was in the average range for both adult faces and child faces. Ozonoff and Miller (1995) cautioned that good performance on a pretest task reduces the amount of possible change evidenced at posttest. Additionally, while staff prompted the children to work on face and emotion recognition during the program, there was only one session per week in which formal face and emotion recognition curriculum was taught. These factors may have contributed to the lack of measured growth in this area.

Overall, the current study replicated findings from the prior study primarily in the area of social skills, with ratings from both parents and staff indicating significant improvement. These findings were the strongest as they not only replicated the previous findings, but also reflected consistency across raters and measures (White et al. 2007). While ratings on the scale measuring atypical behavior were not replicated, the small obtained effect size for the parent ratings was viewed as encouraging for future research. In considering the results of the other measures, parent ratings reflected significant increases in social skills and overall adaptive skills and significant decreases in withdrawal and behavioral symptoms for the program overall. Parent ratings failed to reflect the relative superiority of the RC or NC feedback formats as more effective in improving performance on these measures. The high rates of explicit feedback appeared to have a similar positive outcome based on parent ratings. This finding is consistent with the need for and effectiveness of frequent and explicit feedback (e.g., Bregman et al. 2005; Klin and Volkmar 2000; Safran et al. 2003).

Although staff ratings also supported the effectiveness of the overall program for social and adaptive skills, their ratings on several other scales appeared to be affected by the type of performance feedback. Specifically, the RC format appeared to at least maintain the level of the children’s atypical behaviors and behavioral symptoms, whereas children in the NC group were rated by staff as having increases in these negative behaviors over the course of the program. Similarly, participation in the RC group was associated with a decrease in withdrawn behavior, whereas ratings of children in the NC group remained primarily unchanged. In contrast to parent ratings, which did not reflect significant differences based on performance feedback, staff ratings suggested that the RC feedback offered some potential benefit for some behaviors. The reasons for the differences between parent and staff ratings on a few measures are not known. For example, the discrepancies might have reflected actual context-specific differences in which the children demonstrated different behaviors at home compared to the program. It is also possible that the staff ratings of the more negatively oriented behaviors on which discrepancies were observed (i.e., atypicality, withdrawal, and behavioral symptoms index) were in some way influenced by the limited number of days upon which staff based their pretest ratings (eight days). This was in contrast to the parents, who had more extensive observations upon which to base their pretest ratings. While this did not appear to result in parent–staff outcome differences on the more skills-oriented scales (i.e., social skills, Skillstreaming survey, and adaptive behaviors), the children may not have fully exhibited many of the negative behaviors for staff prior to pretest. This could have contributed in some way to the parent–staff outcome discrepancies for the negatively oriented scales.
The speculative nature of these explanations for a few of the scales, however, indicates a need for more research in this area. In addition, the potential benefits of the RC format require much closer examination, as the current findings are new, restricted to a limited number of scales, and only evidenced in staff ratings. While clearly preliminary, the potential effectiveness of contingent rewards for certain behaviors has been previously identified (e.g., Cragar and Horvath 2003; Thiemann and Goldstein 2004; Wymbs et al. 2005).

The feasibility of the current manualized treatment program was assessed by considering treatment integrity and parent satisfaction, as well as social outcomes. As noted earlier, treatment integrity was assessed using standardized fidelity sheets. The high levels of overall fidelity (95.75% for social skills groups and 96% for activities) over the two summers, as well as the comparability of fidelity scores for each treatment condition, appear to support the capacity of the program to be implemented in a standardized manner and with a high degree of treatment integrity. The current fidelity ratings also represent an increase over the 87% reported in the original study (Lopata et al. 2006). Parent satisfaction ratings indicated high levels of satisfaction across the nine areas assessed by the survey. Of particular relevance to ongoing feasibility were the items on which 100% of parents reported feeling “positive” or “very positive” about the program and indicated they would “recommend” or “strongly recommend” the program to another. The treatment integrity and parent satisfaction ratings, as well as the overall improved ratings on several social outcome measures, appeared to support ongoing program feasibility.

In sum, results of the current study are generally consistent with other studies that have used cognitive-behavioral techniques to promote skills and social behaviors (e.g., Barnhill et al. 2002; Cragar and Horvath 2003; Lopata et al. 2006; Ozonoff and Miller 1995; Solomon et al. 2004). The findings also support the contention that social skills interventions should deconstruct complex social behaviors into their component parts, directly and explicitly teach skills in a part-to-whole sequence, and provide specific performance feedback (Howlin et al. 1999; Klin and Volkmar 2000; Simpson and Myles 1998). The commonly used techniques of teaching, modeling, and role-playing (Barnhill et al. 2002; Mesibov 1984; Ozonoff and Miller 1995; Solomon et al. 2004) were also found to be effective in the current study. Although Ozonoff and Miller (1995) justifiably cautioned that packaged social skills programs originally developed for non-autism spectrum populations may be inappropriate as they assume certain social-cognitive skills that may be lacking in children with HFASDs, the current findings suggest that programs such as Skillstreaming that explicitly teach task-analyzed complex social behavior may be effective for some children with HFASDs. Factors such as cognitive ability and language, however, may have a significant effect on the applicability of these programs and should be carefully considered.

While there were several positive outcomes associated with the current study, there were a number of limitations that warrant mention. Though the sample was relatively large compared with other social intervention studies, it was nonetheless limited in terms of size and demographic representativeness. This significantly restricts the generalizability of the findings. The study was also limited by the lack of a no-treatment control group. The random assignment used in this study allowed for the comparison of two types of performance feedback; however, the lack of a no-treatment control group leaves the study vulnerable to several threats to internal validity (e.g., statistical regression, history). Though the researchers used an intensive screening procedure to confirm the presence of a HFASD, the screening did not involve a “gold standard” autism diagnostic instrument (i.e., ADOS or ADI-R). Predominant reliance on rating scales for evaluating the outcomes was another limitation, as potential rater bias could have affected the results. For example, the parents and staff were all aware that the children were participating in the program, which may have influenced their ratings in favor of the treatment. The use of multiple raters, however, was considered a strength (White et al. 2007). Additionally, the impact of the treatment on everyday functioning beyond the treatment setting requires further attention. The improvements in parent ratings suggested that the new skills and behaviors were affecting performance beyond the immediate treatment setting; however, direct measures of this impact on social proficiency were not collected. Such measures would provide valuable information about the magnitude of change needed to meaningfully change social proficiency, as would some additional follow-up assessments to determine generalization and maintenance of skills.
Based on these limitations, future studies should attempt to recruit more diverse and representative participants and confirm the participants’ diagnoses using standardized autism diagnostic instruments. Studies would also be strengthened by randomized designs that include a no-treatment control group, the use of more direct measures of social behaviors (direct observations) in addition to rating scales, follow-up measures, and measures that assess clinically meaningful changes in everyday functioning.