Introduction

In 2002, Siller and Sigman published the first prospective longitudinal study to show that responsive parental behaviors reliably predict the long-term language outcomes of children with Autism Spectrum Disorder (ASD). Results showed that parents who were more responsive to their children’s attention and activity during initial toy play (chronological age: M = 50.3; SD = 11.7) had children who made larger subsequent gains in language abilities over a period of 10 and 16 years than parents who were less responsive initially. Importantly, these predictive relations could not be explained by initial variation in child characteristics such as mental age, language age, IQ, or joint attention. During recent years, at least four published reports, involving independent prospective longitudinal samples of young children with ASD, have replicated and extended these initial findings (Siller and Sigman 2008; Baker et al. 2010; McDuffie and Yoder 2010; Adamson et al. 2009). Similar predictive relations have also been reported for children born prematurely or with low birth weight (Landry et al. 2006; Landry et al. 2001), children with early developmental delay (Baker et al. 2007), children with Down syndrome (Harris et al. 1996), and children with Fragile X syndrome (Warren et al. 2010). Finally, research on typically developing children has helped specify the boundaries of a specific developmental window (between 9 and 15 months) during which children’s language acquisition is particularly dependent on parental language input that is contingent upon children’s attention and activity (e.g., Akhtar et al. 1991; Carpenter et al. 1998; Smith et al. 1988; Tamis-LeMonda et al. 2001).

Despite this growing evidence base, the efficacy of strategies used to teach or promote responsive parental behaviors has received limited attention (Dunst and Trivette 2009). Informed by principles of adult education (Collins 2004; Trivette et al. 2009), the emphasis of parent education in general has shifted away from a narrow focus on skill attainment (Anderson et al. 1987; Lovaas 1987) and moved towards a more holistic approach that aims to enhance the capacity of families to meet the needs of their children. A family-centered approach is required for early intervention programs funded through Part C (Individuals with Disabilities Education Improvement Act of 2004, IDEIA) and is consistent with practice recommendations published by the Division for Early Childhood of the Council for Exceptional Children (DEC; Sandall et al. 2005) and the National Association for the Education of Young Children (NAEYC; Copple and Bredekamp 2009). Based on a review of the literature, Woods and Brown (2011) identified four global strategies to support family capacity building: (1) addressing the families’ informational needs, (2) using their natural environments as the intervention context, (3) engaging parents to be active participants in the intervention process, and (4) supporting the caregivers’ reflection and self-evaluation. The experimental intervention (Focused Playtime Intervention, FPI) evaluated in the current research project aims to promote responsive parental behaviors in the context of a family-centered intervention. Specific FPI strategies to support family capacity building are reviewed in Table 1.

Table 1 Strategies to support family capacity building implemented in Focused Playtime Intervention (FPI)

Although findings from prospective longitudinal research are important, correlational findings do not allow us to draw firm conclusions about the causal link between responsive parental behaviors and children’s subsequent language development. Thus, the current study uses an experimental design in which participants are randomly assigned to different treatment conditions. The main goal of this research is to evaluate the effect of FPI on gains in responsive parental communication (i.e., maternal synchronization; Siller and Sigman 2002, 2008) and gains in children’s expressive language abilities. A second goal is to examine two conditional effects of FPI. First, we predict that baseline classification of maternal insightfulness moderates the effect of treatment on gains in maternal synchronization. During the last decade, renewed interest has emerged in research on maternal mental representations, particularly the mother’s capacity to describe her child’s “thoughts, feelings and behaviors in a rich, nuanced, and accepting way” (i.e., maternal insightfulness; Coyne et al. 2007, p. 486; Oppenheim and Koren-Karie 2002). Several studies involving parents of typically developing infants have shown that insightful mothers show higher levels of sensitivity during play interactions with their infant than non-insightful mothers (Demers et al. 2010; Koren-Karie et al. 2002; Coyne et al. 2007). Even though a similar association has also been reported for parents of children with ASD (Hutman et al. 2009), we predict that parental insightfulness is not sufficient for engaging a young child with autism in responsive interactions. That is, parents also require a set of autism-specific interactive tools and strategies. For example, the parents’ ability to interpret the attentional cues of a young child with ASD may only translate into responsive parental communication if the parent also knows how to effectively structure the play environment, manage the child’s repetitive behaviors, and use language to comment on the child’s ongoing engagement with toys. The second conditional effect of FPI investigated in this research is based on the finding that for typically developing infants and toddlers, responsive parental behaviors are particularly important during early stages of language development. Thus, we predict that baseline measures of expressive language moderate the effect of treatment on children’s language outcomes. Assuming treatment effects on both maternal synchronization and children’s language, the third goal of this research is to explore whether the treatment effect on children’s long-term language outcomes is mediated by short-term gains in maternal synchronization.

Methods

Participants

Seventy children participated in this research. To increase the comparability between research participants, we required that children’s mothers participate in all assessment and intervention sessions. The majority of families (53 %) were referred to the study through one of four local, state-funded regional centers. These regional centers serve as a local resource to help find and access the services and supports available to individuals with developmental disabilities and their families in California. The remaining families learned about this study through other research projects or university clinics (25 %), online research directories (11 %), or word-of-mouth (8 %). Families were eligible to participate if (1) the child was 6 years or younger when entering the study, (2) the child had previously been diagnosed with Autism Spectrum Disorder, (3) the child showed limited or no use of spoken language (generally fewer than 25 words and no phrases based on parent report), (4) the child’s mother was fluent in English and willing/available to participate in all assessment and treatment sessions, and (5) the family lived within a reasonable travel distance from the research lab (generally less than 90 min). As shown in Fig. 1, 104 families participated in at least one baseline assessment session. Based on the results of these initial evaluations, 10 children were found to be ineligible. In addition, 24 families failed to complete all necessary baseline assessments or declined to participate, leaving the 70 children who were randomized.

Fig. 1 Participant recruitment, enrollment, randomization, and retention

Descriptive information on child characteristics and non-project services is presented separately for the experimental and control groups (Table 2). The sample included 64 boys and 6 girls. All children met diagnostic criteria for Autistic Disorder on the Autism Diagnostic Interview-Revised (ADI-R; Lord et al. 1994) and 64 children also met diagnostic criteria for Autistic Disorder on the Autism Diagnostic Observation Schedule-Generic (ADOS-G; Lord et al. 2000). Of the remaining 6 children, 5 met criteria for Autism Spectrum Disorder on the ADOS-G, and one child was not administered this measure due to time constraints. A short interview was used to identify known medical conditions. None of the parents indicated the presence of known genetic diagnoses such as Fragile X, Tuberous Sclerosis or Rett Syndrome. However, two children were previously diagnosed with Cerebral Palsy, and 10 children had a history of seizures. Three parents reported that their children were taking medication to control seizures concurrent with the study. Descriptive information on parent and family characteristics is presented separately for the experimental and control groups and, where available, compared with Census data for Los Angeles County (Table 3). The mean age (SD) of children’s mothers in the experimental and control groups was 36.0 years (5.3) and 35.7 years (6.1), respectively. Overall, our research sample approximated the diversity of the local community quite well, with the exception that mothers who did not complete high school were underrepresented.

Table 2 Descriptive information on child characteristics and non-project services reported separately for the experimental and control group
Table 3 Parent and family characteristics comparing the experimental group, the control group, and (as available) census data for Los Angeles County

Overview and Timeline

Data for this randomized clinical trial were collected at a single project site between 2004 and 2007. Three waves of data were collected. Baseline assessments occurred during three individual sessions. Two assessment sessions were held at our research lab and one session was scheduled in the families’ home. In addition to the ADI-R and the ADOS-G, assessments included the Mullen Scales of Early Learning, the Early Social Communication Scale, the Insightfulness Assessment, observations of mother-child interaction, a medical history questionnaire, and a survey of non-project services. For 89 % of the families, all three sessions were held within a period of 2 months; for the remaining 8 families, sessions took place within 3 (n = 3), 4 (n = 2), 5 (n = 2), and 7 (n = 1) months. Once the initial assessments were completed, families were randomly assigned to either the experimental or control condition. To ensure that out of every 4 consecutive children, 2 were assigned to the experimental and 2 were assigned to the control group, children were randomized in clusters of 4 children. This approach retains the positive attributes of random assignment, while equalizing group size, which is useful in terms of preventing cohort effects and managing resources. Throughout the study, staff and students involved in administering assessments or coding observations were kept blind to the participants' group assignments. Prior to all outcome assessment sessions, parents were reminded not to reveal their group assignment to our assessment staff.
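
The blocked allocation described above (2 experimental and 2 control assignments in every 4 consecutive children) can be illustrated with a short sketch. This is not the authors' randomization procedure, whose implementation is not reported; the function name `block_randomize`, the seed, and the use of Python's `random` module are assumptions made only for illustration.

```python
import random


def block_randomize(n_participants, block_size=4, seed=None):
    """Assign participants, in enrollment order, using permuted blocks.

    Hypothetical helper: each complete block of `block_size` consecutive
    participants contains equal numbers of both conditions.
    """
    if block_size % 2 != 0:
        raise ValueError("block_size must be even for 1:1 allocation")
    rng = random.Random(seed)
    assignments = []
    while len(assignments) < n_participants:
        half = block_size // 2
        block = ["experimental"] * half + ["control"] * half
        rng.shuffle(block)  # random order within the block
        assignments.extend(block)
    # The final block may be incomplete if enrollment stops mid-block.
    return assignments[:n_participants]


# Illustrative call; the seed is arbitrary and not taken from the study.
allocation = block_randomize(70, block_size=4, seed=2004)
```

Because 70 is not a multiple of 4, the last two assignments in this sketch come from an incomplete block, so group sizes are balanced only approximately at the end of enrollment.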

Across both treatment conditions, parents were invited to participate in a parent education program that aimed to help parents effectively advocate for their young child with ASD (Parent Advocacy Coaching, PAC). Families assigned to the experimental condition were also invited to participate in Focused Playtime Intervention (FPI). After the last intervention session was completed, families completed a series of exit assessments. Since families required different amounts of time to complete the intervention sessions, the time lag between baseline and exit assessments varied substantially between families, but was well matched between the experimental (M = 147 days, SD = 41, range: 91–279) and control group (M = 141 days, SD = 43, range: 78–255). Finally, families were invited to participate in a final wave of follow up assessments, scheduled approximately 12 months after exit (M = 13.9 months, SD = 4.7, range: 9–32). Assessments administered at exit and follow up included some, but not all, of the measures administered at baseline. Information on subjects’ completion of the allocated intervention, measures and attrition is displayed in Fig. 1 (CONSORT Flow Diagram).

Focused Playtime Intervention: Goals and Content

Focused Playtime Intervention (FPI) is a parent education program that involves 12 in-home training sessions (one session per week for 12 weeks, 90 min per session) and follows a standardized treatment manual (the treatment manual is available as an online resource to this manuscript). As described above and summarized in Table 1, FPI uses a capacity building approach to promote coordinated toy play between parent and child, and includes an ordered sequence of eight topics. Information about the goals and content of each topic is provided in Table 4. FPI was delivered by trained graduate and postdoctoral students in developmental psychology and counseling. All intervention sessions were videotaped and at least two sessions per child were chosen at random and coded using a fidelity checklist. The inter-observer reliability of this fidelity checklist was evaluated based on 20 videotaped sessions, revealing excellent agreement between two independent raters (ICC = 0.85). Results from applying this checklist to 77 intervention sessions (at least 2 intervention topics were selected at random for each child) revealed that 88.3 % showed fidelity scores above 80 % (M = 89.6 %; SD = 9.0).

Table 4 Focused Playtime Intervention (FPI): intervention topics and content

Each treatment session consists of two parts. The first part (30–60 min) involves both parent and child and provides ample opportunities for parent and interventionist to take turns interacting with the child. After the intervention team enters the home, parent and child are provided with a suitcase that includes a standard set of toys. Parent and child are invited to remove the toys from the suitcase and play for a period of 10 min. After this initial episode of parent-child interaction, the interventionist joins the dyad on the floor, and provides the parent with a short overview of the session’s topic (2–4 min). After this initial introduction of the topic, parent and interventionist take turns interacting with the child for an additional 15–45 min. In the context of these interactions, the interventionist demonstrates strategies that relate to the session’s topic, provides specific and concise feedback on the parent’s play (accentuating her positive contributions), and comments on the child’s responses. All interactions between parent, child and interventionist are videotaped and captured live using a laptop computer. The second part of each session (30–60 min) involves only the parent (a co-interventionist is available to help with child care). During this time, each intervention topic is elaborated using a range of adult learning strategies, including an illustrated workbook for parents (the workbook is available as an online resource to this manuscript), video feedback, conventional teaching, and review of weekly homework assignments. Particular emphasis is given to video feedback, where parent and interventionist review specific moments of the videotapes captured during the first half of the session. The interventionist carefully chooses these moments to illustrate specific activities, adult behaviors or child responses as they relate to the topic of the respective session.
In discussing the challenges that a parent may face while engaging her young child with autism in coordinated toy play, the interventionist aims to maintain a collaborative working relationship and engage the parent in active problem solving.

Parent Advocacy Coaching: Goals and Content

Parent Advocacy Coaching (PAC) is a structured education program that aims to promote the parents’ ability to actively participate in the planning of their child’s treatment and educational program. Most families of children with autism in California participate in at least two annual planning meetings: one meeting is scheduled with a representative from the families’ local California Regional Center (i.e., Individual Program Plan Meeting); the second meeting is scheduled with the child’s teacher and/or a representative from the child’s school district (i.e., Individualized Education Program Meeting). Families randomized to the control condition were invited to participate in 4 PAC sessions (one session per month, 90 min per session). Given that the first sessions of PAC and FPI include several shared components (e.g., gathering information on the family and the child’s current intervention program), families in the experimental condition were only invited to participate in 3 PAC sessions. While participating in PAC, parents learned about the structure of the individualized planning process and how to access available resources. They also participated in a structured conversation that aimed to identify developmental needs in the areas of health, daily-living skills, challenging behaviors, social integration, education and family supports. In addition to a detailed report about the results from assessments, parents were provided with a written report summarizing the needs identified during this parent interview.

Measures

Assessments of Non-Verbal Cognitive and Language Abilities

To evaluate nonverbal cognitive and language abilities, children were administered the Mullen Scales of Early Learning (MSEL; Mullen 1995). The MSEL includes four subscales measuring nonverbal cognitive abilities (Visual Reception and Fine Motor Subscale) as well as children’s receptive and expressive language abilities. All subscales provide age equivalent scores for children’s abilities. Even though the MSEL provides norm-referenced T-scores, most children in this study scored outside the range of differentiated scores. For this reason, all reported analyses were based on children’s age equivalent scores.

Insightfulness Assessment

The Insightfulness Assessment (IA: Koren-Karie and Oppenheim 1997; Oppenheim and Koren-Karie 2002) is a semi-structured interview that asks mothers to discuss three previously recorded video vignettes of mother-child interaction. The video footage was obtained at the first laboratory assessment session and the interview was conducted during the home visit. Participants were shown the first 2 min of three interactions, always in the same order: (a) mother and child engaging in free play with scarves; (b) mother and child playing with a standard set of toys; and (c) mother and child cleaning up the play area. After each clip, mothers were asked what the child was thinking and feeling during the preceding interaction; whether the behavior was typical of the child; and whether the clip concerned her, surprised her, or made her happy. Following the vignettes, mothers were asked general questions about characteristics of the child and her relationship with the child. The IA is coded from verbatim interview transcripts. IA transcripts are scored on ten 9-point rating scales, including insight into the child’s motives; flexibility of thought; complexity and richness in description of the child; focus on the interview topic; acceptance; anger; concern; separateness; and coherence of thought. Profiles of scores on the ten scales indicate one of three primary classifications of each interview. Interviews are classified as Positively Insightful, One-Sided, or Disengaged. The latter two categories are considered non-insightful. Based on responses to twenty-three gold-standard transcripts, the second author of this report was certified reliable with the authors of the IA to code other IA transcripts. The second author and a research assistant double coded seventeen IA transcripts, representing 25 % of the current sample. Agreement on primary classification was 82 %, Cohen’s kappa = 0.74. The remainder of the IA transcripts was coded by the second author. 
The IA transcripts of 3 mothers were not scorable, mainly due to poor audiotape quality. In the experimental and control groups, 13 mothers (39.4 %) and 11 mothers (33.3 %), respectively, were classified as Positively Insightful.
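
Cohen's kappa, reported above for agreement on the primary IA classification, corrects raw percentage agreement for the agreement expected by chance. The following sketch shows the standard calculation on a small invented example; the labels and ratings are hypothetical and are not data from this study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who classified the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must classify the same non-empty set of items")
    n = len(rater_a)
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal category frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical ratings (PI = Positively Insightful, OS = One-Sided, DE = Disengaged).
coder_1 = ["PI", "PI", "OS", "DE"]
coder_2 = ["PI", "OS", "OS", "DE"]
kappa = cohens_kappa(coder_1, coder_2)
```

In this toy example the raters agree on 3 of 4 transcripts (75 %), but kappa is lower because some of that agreement would be expected by chance alone.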

Responsive Parental Communication

Each of the three visits (two visits were held in the research lab, one visit was held in the families’ home) at baseline and exit included the videotaping of an episode of mother-child interaction. Mothers were presented with one of two parallel standardized toy sets, instructed “to play as they normally would”, and videotaped with a handheld camera for 10 min. The videographer was instructed to capture an optimal view of (a) the child’s face, (b) the toy the child was playing with, and (c) the mother’s hands. Background noises (e.g., TV, open window, air conditioning) were avoided as much as possible. Two minutes (minutes 3 and 4) of each of the three videotaped interactions were coded with an observational computer system (The Observer Video-Pro, NOLDUS), using the coding system described in detail by Siller and Sigman (2002, 2008). This observational coding system focuses on two behavioral dimensions, which were coded during several passes through the video: (1) maternal verbal behaviors, and (2) children’s toy-directed attention. Interactions were coded by a team of 12 undergraduate research assistants who were blind with regard to the research hypotheses, assessment wave, and treatment condition. Two findings from our previous research suggest that interaction samples as short as 2 min can provide a reliable measurement of responsive parental behaviors. First, as described above, our initial study showed that maternal synchronization scores based on two-minute samples significantly predicted children’s subsequent 10- and 16-year gains in language abilities (Siller and Sigman 2002). Second, in a subsequent study we applied the same coding system to longer video samples of parent-child interaction (M = 14 min; Siller and Sigman 2008). Unpublished analyses revealed that maternal synchronization scores based on the entire 14-minute samples were reliably predicted by the same measures derived from two-minute segments.
The strongest correlations were found for maternal synchronization scores derived from minutes 3 and 4 of the videotaped interaction, r(28) = 0.75, p < 0.01.

Maternal Verbal Behaviors

During the first pass through the video, observers marked the onset of distinct verbal utterances. Once the onset of each utterance was determined, a second coder decided whether each utterance was synchronized with the child’s attention. To make this determination, coders reviewed the one second interval prior to the onset of each indicating behavior. If, during this interval, the child was already gazing at the same toy the mother was about to reference, the maternal indicating behavior was coded as synchronized with the child’s attention. If, on the other hand, the maternal behavior aimed to redirect the child’s attention to a different toy, the behavior was coded as unsynchronized with the child’s attention. Finally, in a third pass through the video, we evaluated the content of each maternal utterance. That is, for each utterance, we determined whether it was synchronized with the child’s actions or not. An utterance was determined to be synchronized with the child’s actions if the mother commented on an action the child was already performing prior to the onset of the utterance (e.g., by describing the child’s action or providing reinforcement). On the other hand, an utterance was determined to be unsynchronized with the child’s actions if the mother verbally suggested an action that was different from the action the child was already performing. For example, if the child was engaged with racing the dump truck on the floor and the mother said, “Can you dump the truck?”, the maternal utterance would be coded as unsynchronized with the child’s actions. On the other hand, if the mother said “Oh boy, this truck is driving fast!” the utterance would be coded as synchronized with the child’s actions. At least 20 % of the videotaped interactions (85–99 videos) were coded by two independent observers. To evaluate reliability for coding the onset of maternal verbal behaviors, a tolerance of 2 s was used, and percentage agreement indices were calculated. This approach seemed appropriate since interobserver differences in timing were very small (97 % of agreements were within 0.5 s). Thus, the possibility of chance agreement is negligible. Percentage agreement indices for the onset of maternal verbal behaviors ranged between 86 and 91 %. For the determination as to whether maternal behaviors were synchronized with the child’s attention or not, Kappa coefficients showed a mean agreement of 0.77 and ranged between 0.72 and 0.82. Similarly, Kappa coefficients showed a mean agreement of 0.76 (range: 0.73–0.81) for the decision whether maternal utterances were synchronized with the child’s action or not.
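
The tolerance-based agreement on utterance onsets can be sketched as follows. The exact agreement formula is not reported here, so this sketch makes two assumptions: onsets from the two observers are paired one-to-one in temporal order when they fall within the tolerance, and percentage agreement is the number of paired onsets divided by the number of paired plus unpaired onsets. The function name and the onset times are hypothetical.

```python
def onset_agreement(onsets_a, onsets_b, tolerance=2.0):
    """Proportion of onsets on which two observers agree within `tolerance` seconds.

    Assumed definition: agreements / (agreements + disagreements), where an
    agreement is a one-to-one pairing of onsets no more than `tolerance` apart.
    """
    a, b = sorted(onsets_a), sorted(onsets_b)
    if not a and not b:
        raise ValueError("at least one observer must have coded an onset")
    i = j = matched = 0
    while i < len(a) and j < len(b):
        if abs(a[i] - b[j]) <= tolerance:
            matched += 1  # both observers marked this onset
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1  # onset marked only by observer A
        else:
            j += 1  # onset marked only by observer B
    total = len(a) + len(b) - matched
    return matched / total


# Invented onset times in seconds for one two-minute clip.
observer_1 = [1.0, 5.2, 9.8, 15.0]
observer_2 = [1.3, 5.0, 12.5, 15.4]
agreement = onset_agreement(observer_1, observer_2, tolerance=2.0)
```

Here three onsets are paired and two (9.8 s and 12.5 s) are not, giving an agreement of 3 out of 5.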

Children’s Toy-Directed Attention

This part of the coding system was designed to measure the proportion of observation time children were attending to the target toys. The coding was based on 30 video still frames, chosen at random from each 2 min clip. For each still frame, coders determined whether the child was looking at one of the target toys or not. Based on this random sample of 30 events, we estimated the percentage of children’s toy-directed attention. To establish interobserver agreement, 20 % of the videotaped interactions (85 videos) were coded by two independent observers. Intraclass correlation coefficients were calculated to evaluate the reliability of the percentage of children’s toy-directed attention, revealing excellent agreement, ICC: M (range) = 0.85 (0.81–0.91).

Measure of Maternal Synchronization

Consistent with Siller and Sigman (2002, 2008), the final measure of maternal synchronization included in this analysis was computed as the percentage of verbal behaviors that were synchronized with both children’s attention and action, divided by the percentage of time children attended to toys. Adjusting the percentage of synchronized maternal utterances by children’s toy-directed attention is necessary to control for the mothers’ opportunity to act in synchrony. According to our definitions, mothers only have the opportunity to act in synchrony at times during which the child is attending to one of the target toys.
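
As a worked illustration of this ratio, the sketch below computes the measure for one invented two-minute clip. The `Utterance` record, the function name, and all counts are hypothetical; the treatment of clips with no toy-directed attention (raising an error) is an assumption, since the handling of that case is not described.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    """One coded maternal utterance (hypothetical record layout)."""
    synced_with_attention: bool
    synced_with_action: bool


def maternal_synchronization(utterances, frames_on_toy, frames_total=30):
    """Percent of utterances synchronized with attention AND action,
    divided by the percent of sampled frames in which the child attended to toys."""
    if not utterances:
        raise ValueError("no utterances were coded")
    if not 0 < frames_on_toy <= frames_total:
        # Assumption: the ratio is undefined without any toy-directed attention.
        raise ValueError("frames_on_toy must be between 1 and frames_total")
    n_synchronized = sum(
        u.synced_with_attention and u.synced_with_action for u in utterances
    )
    pct_synchronized = 100.0 * n_synchronized / len(utterances)
    pct_toy_attention = 100.0 * frames_on_toy / frames_total
    return pct_synchronized / pct_toy_attention


# Invented clip: 20 utterances, 9 synchronized with both attention and action;
# the child looked at a target toy in 18 of the 30 sampled still frames.
clip = (
    [Utterance(True, True)] * 9
    + [Utterance(True, False)] * 3
    + [Utterance(False, False)] * 8
)
score = maternal_synchronization(clip, frames_on_toy=18)
```

In this example 45 % of utterances are synchronized and the child attends to toys 60 % of the time, so the adjusted score is 0.75.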

Response to Bids for Joint Attention (RJA)

Children’s responsiveness to others’ bids for joint attention was evaluated during each of the two lab visits. During each lab visit we administered four kinds of probes. (1) Response to name: This was evaluated during the warm-up period of each assessment session. The child was provided with a set of toys, which were laid out on the floor (e.g., a colorful play mat, large colored blocks, music toys). Once the child was comfortable, the examiner positioned herself at a 90-degree angle to the child and called the child’s name (3 trials). The remaining RJA probes were administered in the context of the Early Social Communication Scale (ESCS, Seibert et al. 1982). In this procedure the child and examiner sat facing each other at a small table. A set of toys was in view but out of reach to the child. (2) Response to a head turn: After eliciting eye contact from the child, the examiner called the child’s name while turning her head/gaze towards posters displayed to the left, right, and behind the child (3 trials). (3) Response to a head turn with pointing gesture: After eliciting eye contact from the child, the examiner called the child’s name while turning her head/gaze and pointing towards posters displayed to the left, right, and behind the child (3 trials). (4) Response to pointing during book reading: While looking at a picture book with the child, the examiner pointed to pictures and called the child’s name (9 trials). All probes were videotaped and coded to determine children’s responses during each trial. For each kind of probe (across both assessment sessions), we calculated the percentage of instances where the child correctly responded to the examiner’s bid for attention. The final measure of RJA was the average percentage of successful responses across all four kinds of probes. Inter-observer reliability was evaluated based on more than 70 assessment sessions (above 25 %).
Across the different probes, intra-class correlation coefficients ranged from ICC = 0.85 to ICC = 0.93, demonstrating excellent agreement between two independent observers.
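
The RJA composite described above is an unweighted mean of four probe-level percentages, so the 18 book-reading trials count no more than the 6 trials of any other probe. A minimal sketch with invented trial outcomes follows; the function name, the dictionary keys, and the decision to raise an error for a probe without trials are assumptions.

```python
def rja_score(trials_by_probe):
    """Average, across probe kinds, of the percentage of correct responses.

    `trials_by_probe` maps a probe name to a list of booleans (True = correct
    response), pooled across both lab visits.
    """
    percentages = []
    for probe, trials in trials_by_probe.items():
        if not trials:
            # Assumption: a probe with no administered trials cannot be scored.
            raise ValueError(f"no trials recorded for probe {probe!r}")
        percentages.append(100.0 * sum(trials) / len(trials))
    return sum(percentages) / len(percentages)


# Invented outcomes: 3 trials per visit for probes 1-3, 9 per visit for probe 4.
example_child = {
    "response_to_name": [True, False, True, False, True, False],
    "head_turn": [False] * 6,
    "head_turn_with_point": [True, True, False, False, True, False],
    "point_during_book": [True] * 9 + [False] * 9,
}
rja = rja_score(example_child)
```

The probe-level percentages here are 50, 0, 50, and 50, giving a composite of 37.5.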

Survey of Non-Project Services

At baseline, parents were interviewed about services their child had received during the preceding 12-month period, using a structured questionnaire developed by Bono et al. (2004). As part of this interview, parents were asked whether children received a range of specialized services for children with ASD (e.g., occupational therapy, speech therapy, applied behavior analysis/ABA, floortime/DIR, social groups) or participated in an educational program (center based early intervention program, preschool, kindergarten, elementary school). If the parents indicated that the child received such services, we also inquired about the time period during which each service was received, the intensity of the service (number of hours per week), whether the service was delivered individually or in a group setting, and whether it was delivered at home or school. The interviews were re-administered after the intervention was completed as well as 12 months thereafter. Data collected during these interviews were entered into a database, programmed in Microsoft® Access. Using this database, we extracted summary information for three time windows: (1) the 12-month period prior to the beginning of the intervention, (2) the time period between the beginning and end of the intervention, and (3) the time window between the end of the intervention and the 12 month follow up assessments. For each time window, we computed (1) the average number of hours per week during which the child received specialized autism services that were delivered individually; and (2) the average number of hours per week during which the child attended an educational program.

Medical History Survey

As part of the 12-month follow up assessments, parents were administered an interview concerning select aspects of children’s medical history. The survey included questions about a range of known medical conditions (e.g., Fragile X, Tuberous Sclerosis, Rett Syndrome, Hydrocephalus, Cerebral Palsy). In addition, the survey inquired about seizures, abnormal MRI or CT scans of the brain, meningitis/encephalitis and head injuries.

Data Analysis

Intent-to-Treat Approach

Analyses were performed on an intent-to-treat basis. Prior to performing the key analyses, we used multiple imputation to deal with the missing data. Briefly, multiple imputation uses a regression-based procedure to generate multiple copies of the data set, each of which contains different estimates of the missing values (Enders 2010). We used the data augmentation algorithm in the SAS MI procedure to generate 100 imputed data sets (Graham et al. 2007 recommend at least 20 for most situations). The imputation process included all variables that appeared in one or more of the subsequent regression analyses as well as seven auxiliary variables (see below). The methodological literature currently recommends an inclusive analysis strategy that incorporates auxiliary variables into the missing data handling procedures because this approach can make the missing at random assumption more plausible and can improve statistical power (Collins et al. 2001).

To identify auxiliary variables that correlate with missingness, two types of missing data were distinguished: (a) missing data that resulted from participant attrition (participants who dropped from the study, either before the exit or before the follow up assessments, n = 8); (b) missing data that were present in participants who did not drop from the study (sporadically missing data, n = 13). Mean comparisons revealed that participants who dropped from the study took longer to complete the intervention period and received fewer autism specific non-project services in the community, on average. Similarly, mean comparisons revealed that participants with sporadically missing data tended to have lower joint attention skills and were more likely to have mothers who were born within the US. To correct for any systematic bias that might be related to these differences, all four variables were used as auxiliary variables in the missing data handling procedures. We also used baseline measures of receptive language, chronological age and annual household income (log scale) as auxiliary variables because of their correlation with incomplete outcome measures. After creating the complete data sets, we estimated the multiple regression models on each filled-in data set and subsequently used SAS MIANALYZE to combine the parameter estimates and standard errors into a single set of results. Note that methodologists currently regard multiple imputation as a “state of the art” missing data technique because it improves the accuracy and the power of the analysis relative to other missing data handling methods (Schaefer and Graham 2002).
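
The step of combining estimates across imputed data sets is conventionally done with Rubin's rules: the pooled estimate is the mean of the per-data-set estimates, and its variance adds the average within-imputation variance to the between-imputation variance inflated by (1 + 1/m). The Python sketch below illustrates that arithmetic only; it is not the SAS code used in this study, the numbers are invented, and the degrees-of-freedom calculation is omitted.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, standard_errors):
    """Pool one parameter across m imputed data sets using Rubin's rules."""
    m = len(estimates)
    if m < 2 or m != len(standard_errors):
        raise ValueError("need matching estimates and standard errors from m >= 2 data sets")
    pooled_estimate = mean(estimates)
    within = mean([se ** 2 for se in standard_errors])  # average sampling variance
    between = variance(estimates)  # sample variance of the estimates (divisor m - 1)
    total_variance = within + (1 + 1 / m) * between
    return pooled_estimate, math.sqrt(total_variance)


# Invented regression coefficients from three imputed data sets.
pooled_b, pooled_se = pool_rubin([0.50, 0.60, 0.70], [0.10, 0.10, 0.10])
```

The pooled standard error in this example (about 0.15) exceeds each single-data-set standard error of 0.10, reflecting the added uncertainty due to the missing values.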

Primary Hypothesis Testing Approach

The main goal of this analysis was to evaluate the effect of FPI on gains in maternal synchronization from baseline to exit, and on gains in children’s expressive language from baseline to follow up. Consistent with recommendations for clinical trials (Fitzmaurice et al. 2004; Carter et al. 2011), gains in maternal synchronization and children’s language were quantified as residual gain scores. One advantage of this approach is that it can provide considerably more power to detect treatment effects than other statistical methods (see NICHD ECCRN and Duncan 2003 for a comparison of different approaches). Residual gain scores were obtained by regressing the later measure of each variable on the Time 1 measure of the same variable. The residual errors for each subject were then used as the criterion scores quantifying change. In the context of the current study, residual gain scores answer whether a participant randomized to FPI is expected to change more than a participant in the control condition, given that they have the same initial value. Linear regression analysis revealed that baseline language scores reliably predicted children’s language scores at follow up, B = 0.89, SE B = 0.08, t(61) = 11.7, p < 0.001. Sixty-nine percent of the variability in children’s expressive language at follow up can be accounted for by baseline variation in that variable. Baseline scores in maternal synchronization did not reliably predict the same scores at exit, B = 0.17, SE B = 0.16, t(43) = 1.1, ns.
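As a minimal sketch of how a residual gain score can be computed, the following fits a simple regression of the later score on the baseline score and keeps each participant's residual. The function name `residual_gain_scores` and the example data are hypothetical, and the sketch assumes complete data for a single variable rather than the imputed data sets used in the study.

```python
from statistics import mean


def residual_gain_scores(baseline, later):
    """Return residuals from regressing the later score on the baseline score.

    A positive residual means the participant scored higher at the later
    time point than predicted from the baseline score alone.
    """
    if len(baseline) != len(later) or len(baseline) < 3:
        raise ValueError("need matching lists with three or more participants")
    mean_x, mean_y = mean(baseline), mean(later)
    sxx = sum((x - mean_x) ** 2 for x in baseline)
    if sxx == 0:
        raise ValueError("baseline scores must vary")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(baseline, later))
    slope = sxy / sxx                       # least-squares slope
    intercept = mean_y - slope * mean_x     # least-squares intercept
    return [y - (intercept + slope * x) for x, y in zip(baseline, later)]


# Hypothetical language ages in months (illustration only).
gains = residual_gain_scores([10, 12, 14, 16], [12, 15, 15, 20])
print([round(g, 2) for g in gains])
```

By construction the residuals sum to zero and are uncorrelated with the baseline score, which is why two participants with the same initial value can be compared directly on this measure.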

Results

Preliminary Analyses

Prior to evaluating the primary hypotheses, potentially confounding variables were examined. To check that the experimental and control groups did not differ at baseline, independent-samples t tests for continuous variables (e.g., nonverbal mental age) and Chi-square tests for categorical variables (e.g., insightfulness classification) were performed as appropriate. Skewed variables were transformed throughout, usually by taking logs. Measures considered for this analysis included baseline measures of primary outcome variables (e.g., maternal synchronization, expressive language), putative moderators (e.g., maternal insightfulness classification), socioeconomic characteristics (e.g., family income, ethnicity/race, number of siblings, birth father living with family, home owned/rented, mother born within the US) and baseline variables potentially associated with outcomes (e.g., maternal age, education and employment, non-project interventions and programs, children’s chronological ages, Mullen scores, ADOS scores, Response to Joint Attention scores). Results from this analysis revealed no significant differences between the experimental and control groups on any of the evaluated measures (p > 0.15).

Evaluating Treatment Effects on Maternal Synchronization

To test the main effect of treatment group allocation on maternal synchronization, we specified a series of multiple regression models using SAS PROC REG. All models included children’s chronological age at baseline as a covariate as well as a main effect for treatment group assignment. Results revealed a significant main effect of treatment group allocation on gains in maternal synchronization from Time 1 to Time 2, t(56) = 2.25, p < 0.05. Detailed results from the regression analysis are reported in Table 5.

Table 5 Means and standard errors for maternal synchronization and language (raw and residual gain scores)

To evaluate whether the treatment effect on gain in maternal synchronization is moderated by baseline measures of maternal insightfulness, we added to the main effects of chronological age and treatment group a main effect of maternal insightfulness as well as a group-by-insightfulness interaction term to our regression model. Results showed that baseline classifications of maternal insightfulness moderated treatment effects on residual gain scores in maternal synchronization from T1 to T2, t(51) = 2.12, p < 0.05. Detailed results from the regression analysis are reported in Table 6 and presented graphically in Fig. 2a. Follow up analysis revealed a significant treatment effect for mothers who were classified as insightful at baseline, t(58) = 3.1, p < 0.01, but not for mothers who were classified as non-insightful, t(51) = 0.56, p = 0.58. For mothers classified as insightful, effect size estimates revealed a moderate to large treatment effect, f² (range) = 0.31 (0.24–0.38). Finally, the parameter estimate for insightful mothers assigned to the experimental group was significantly larger than zero, t(57) = 2.1, p < 0.05, indicating a significant increase in maternal synchronization between T1 and T2. In contrast, the parameter estimate for insightful mothers assigned to the control group was significantly smaller than zero, t(57) = −2.5, p < 0.05, indicating a significant decrease in maternal synchronization over time.
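For readers unfamiliar with the f² effect size reported here, the following sketch shows one common definition (Cohen 1988) for the effect of a predictor added to a regression model. The source does not state which variant of the formula was used or how the reported range was derived, so the function name `cohens_f2` and the R² values below are illustrative assumptions only.

```python
def cohens_f2(r2_full, r2_reduced=0.0):
    """Cohen's f-squared for the predictors added in the full model.

    r2_full    -- R-squared of the model that includes the effect of interest
    r2_reduced -- R-squared of the model without it (0.0 for a global effect)
    """
    if not 0 <= r2_reduced <= r2_full < 1:
        raise ValueError("expected 0 <= r2_reduced <= r2_full < 1")
    return (r2_full - r2_reduced) / (1 - r2_full)


# Hypothetical R-squared values (illustration only, not taken from the study).
print(round(cohens_f2(0.30, 0.10), 3))
```

Cohen's conventional benchmarks for f² are 0.02 (small), 0.15 (medium), and 0.35 (large).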

Table 6 Regression analysis predicting gains in maternal synchronization and expressive language from treatment status and maternal insightfulness and baseline language, respectively
Fig. 2 Interaction plots: (a) group-by-insightfulness interaction predicting residual gains in maternal synchronization between baseline and exit; (b) group-by-baseline language interaction predicting gains in expressive language between baseline and 12 month follow up

Evaluating Treatment Effects on Expressive Language

To test the main effect of treatment group assignment on children’s expressive language ability, we specified a series of multiple regression models using SAS PROC REG. All models included children’s chronological age at baseline as a covariate as well as a main effect for treatment group allocation. The main effect for treatment group allocation on change in expressive language from Time 1 to Time 3 was not significant, t(57) = 1.21, p = 0.23. Detailed results are reported in Table 5.

To evaluate whether the treatment effect on gain in children’s expressive language is moderated by baseline measures of expressive language, we added a main effect of baseline language as well as a group-by-baseline language interaction effect to our regression model. As recommended by Aiken and West (1991), the interaction terms were created by grand-mean centering the Time 1 moderator variable and multiplying it by the dummy coded treatment group variable. Significant interaction terms were interpreted using a ‘regions of significance’ approach, which identifies specific values of the moderator variable at which the experimental and control groups show significant differences in outcome (Preacher et al. 2006; Breitborde et al. 2010). As shown in Table 6, baseline measures of expressive language moderated treatment effects on residual gain scores in expressive language from T1 to T3, t(57) = −2.47, p < 0.05. In predicting residual gains in expressive language, only lower regions of significance were interpretable. That is, children with baseline expressive language abilities below 11.3 months showed larger gains in expressive language when randomized to the experimental rather than the control condition (see Fig. 2b). The current sample included 24 children with expressive language skills below 11.3 months. For these 24 children, effect size estimates indicated a medium to large treatment effect, f² (range) = 0.25 (0.09–0.36).
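The centering step and the logic behind a regions-of-significance analysis can be sketched as follows. This is an illustrative outline, not the authors' SAS code: the function names and all numeric values are hypothetical, and the coefficient variances and covariance are assumed to come from an already fitted regression model.

```python
import math
from statistics import mean


def build_moderation_terms(group, moderator):
    """Grand-mean center the moderator and multiply it by the 0/1 group dummy."""
    grand_mean = mean(moderator)
    centered = [value - grand_mean for value in moderator]
    interaction = [g * c for g, c in zip(group, centered)]
    return centered, interaction


def conditional_group_effect(b_group, b_inter, var_group, var_inter,
                             cov_group_inter, centered_value):
    """Estimated group difference, and its standard error, at one moderator value.

    b_group and b_inter are the fitted coefficients for the group dummy and
    the group-by-moderator product term; the variances and covariance are
    taken from the coefficient covariance matrix of the same model.
    """
    effect = b_group + b_inter * centered_value
    variance = (var_group
                + 2 * centered_value * cov_group_inter
                + centered_value ** 2 * var_inter)
    return effect, math.sqrt(variance)


# Hypothetical data and coefficients (illustration only).
centered, product = build_moderation_terms([0, 1, 0, 1], [8, 10, 14, 16])
effect, se = conditional_group_effect(1.0, -0.5, 0.25, 0.01, 0.0, -4.0)
print(centered, product, round(effect / se, 2))
```

Evaluating the ratio of the effect to its standard error over a grid of moderator values, and comparing it with the critical t value, identifies the range of the moderator in which the group difference is significant.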

Exploring the Association Between Maternal Synchronization and Expressive Language

The final set of analyses evaluated the association between short-term gains in parent behaviors (residual gains in maternal synchronization between T1 and T2) and long-term gains in children’s expressive language (residual gains in expressive language between T1 and T3). The purpose of this analysis was to explore whether a more comprehensive mediation analysis (MacKinnon 2008) was indicated. As a first step, we used multiple regression to evaluate whether children’s long-term language gains were predicted by short-term gains in maternal synchronization. The corresponding multiple regression model was specified with residual gains in expressive language between T1 and T3 as the dependent variable and residual gains of maternal synchronization as the independent variable. Results revealed no significant relation between short-term gains in maternal synchronization and long-term gains in expressive language, t(48) = −0.57, p = 0.57.

Previous analyses revealed that treatment effects on gains in maternal synchronization were moderated by baseline classifications in maternal insightfulness. Thus, we used multiple regression to evaluate whether children’s long-term language gains were predicted by short-term gains in maternal synchronization, and whether this relation was moderated by baseline classifications of maternal insightfulness. The corresponding multiple regression model was specified with residual gains in expressive language between T1 and T3 as the dependent variable. Independent variables included residual gains of maternal synchronization, baseline insightfulness classifications, and the synchronization-by-insightfulness interaction effect. Results revealed no evidence in support of the hypothesis that initial insightfulness classifications moderate the link between short-term gains in maternal synchronization and long-term gains in expressive language, t(50) = 1.2, p = 0.22.

Previous analyses also revealed that treatment effects on gains in expressive language were moderated by baseline measures of expressive language. Thus, we used multiple regression to evaluate whether children’s long-term language gains were predicted by short-term gains in maternal synchronization, and whether this relation was moderated by baseline measures of expressive language. The corresponding multiple regression model was specified with residual gains in expressive language between T1 and T3 as the dependent variable. Independent variables included residual gains of maternal synchronization, baseline measures of expressive language, and the maternal synchronization-by-baseline language interaction effect. Results revealed no evidence in support of the hypothesis that initial levels of expressive language moderate the link between short-term gains in maternal synchronization and long-term gains in expressive language, t(55) = −1.2, p = 0.23.

Discussion

Converging evidence from several recent longitudinal studies suggests that responsive parental behaviors reliably predict subsequent language gains in young children with ASD. To investigate the causal mechanisms that underlie this prediction, we conducted a clinical trial to evaluate the efficacy of an experimental intervention that aims to enhance responsive parental behaviors in the context of parent-child play interactions (Focused Playtime Intervention, FPI). This research had two major findings. First, results showed that parents who were randomly assigned to the experimental condition showed larger gains in responsive behaviors than parents who were assigned to the control condition. Interestingly, this treatment effect was moderated by baseline measures of maternal insightfulness, indicating that only parents who were classified as insightful evidenced a significant benefit from participating in the experimental intervention. Second, findings revealed a conditional effect of FPI on children’s expressive language outcomes. That is, for children who entered the study with expressive language skills below 12 months (n = 24), results revealed a significant, medium to large treatment effect that accounted for approximately 25 % of the variance in children’s subsequent language gains. A comparable treatment effect was not found for children who entered the study with more advanced language skills.

Treatment Effects on Responsive Parental Communication

Several recent clinical trials have demonstrated that parents of young children with ASD can be effectively taught to implement a variety of interactive strategies and intervention techniques. Strategies that were effectively taught to parents include responsive communication (Green et al. 2010, 2011) and strategies to support joint engagement between parent and child (Kasari et al. 2010). In the current study, responsive parental communication was evaluated using an observational measure that has previously been shown to predict subsequent language gains in children with ASD (i.e., maternal synchronization; Siller and Sigman 2002, 2008). To increase the ecological validity of our observations before and after the intervention, samples of parent-child interaction were videotaped on three separate days, both in the research lab and the families’ home, using two parallel sets of toys. Across all 70 participants, treatment group allocation accounted for approximately 8 % of the variance in maternal synchrony gains between T1 and T2, f² (range) = 0.08 (0.02–0.17). By convention, this effect size is considered to be in the small to medium range (Cohen 1988), which is consistent with findings reported by Green et al. (2010) and Carter et al. (2011).

Despite this overall treatment effect on responsive parental behaviors, not all parents seemed to benefit from FPI in the same way. Specifically, parents who were classified as insightful at baseline showed a significant, moderate to large treatment effect in maternal synchronization, f² (range) = 0.31 (0.24–0.38). In contrast, this treatment effect was not significant for parents who were classified as non-insightful at baseline. The fact that FPI failed to increase responsive communication in mothers who were initially classified as non-insightful may be attributed to individual differences in maternal learning styles. As emphasized above, FPI uses a capacity-building approach that aims to engage parents to be active participants in the intervention process. Even though this approach is informed by general principles of adult education (Collins 2004; Trivette and Dunst 2009), it is possible that some parents would benefit from a stronger emphasis on skill attainment. In addition, it is worth pointing out that insightful parents assigned to the control condition showed a significant decrease in responsive parental behavior over time. Even though this decrease was not initially predicted, this finding is intriguing because it may indicate that unless responsive parental behaviors are cultivated and encouraged by the child’s intervention team, even insightful parents may shift to a more adult-directed interactive style. For example, one mother whose child was also enrolled in an intense behavioral intervention program explained that the experimental intervention helped her remember that her main role is not to be an interventionist but rather a responsive communication partner and parent. “When all of this [the child’s intense behavioral intervention program] began, I felt like I was giving up. It was like, ‘well, here is my son, and now you are going to help him’. With you guys [FPI] coming in, it reminded me that I am active here and I can make a big difference, and it’s not about ‘do-this, do-this, do-this and here is a puff [reward]’. It is not all about that and it does not have to be all about that. So it was nice to have those tools and that reminder ‘okay mom, you can’t leave it up to them; it has to be through you too.’”

Treatment Effects on Children’s Expressive Language

The eligibility criteria of this research were in large part based on developmental theory, which suggests that responsive parental communication is particularly important during early stages of language development (9–15 months; Akhtar et al. 1991; Carpenter et al. 1998; Smith et al. 1988; Tamis-LeMonda et al. 2001). Despite this focus on early language development, our final sample included 18 children with expressive language skills above 20 months. Including some children with more advanced language skills was necessary to ensure a steady flow of research participants throughout the study period. At the same time, our decision to broaden our research sample beyond children who were entirely nonverbal was prompted by earlier findings suggesting that children with milder ASD symptoms (Smith et al. 2000) and higher scores on intelligence tests (Sheinkopf and Siegel 1998) are generally more likely to benefit from treatment. Given our somewhat heterogeneous research sample, results failed to show a significant main effect of treatment group allocation on subsequent language outcomes. However, findings revealed a conditional effect of FPI on children’s expressive language outcomes. That is, for children who entered the study with expressive language skills below 12 months (n = 24), results showed a significant, medium to large treatment effect that accounted for approximately 25 % of the variance in children’s subsequent language gains.

This pattern of results raises several important issues. First, several recent clinical trials evaluating the efficacy of parent-mediated interventions for young children with ASD failed to show main effects of treatment group allocation on distal outcomes such as language skills or symptom severity (i.e., ADOS scores; Green et al. 2010; Carter et al. 2011). The absence of significant main effects on distal outcomes raises questions about treatment intensity. Research on parent-mediated interventions in ASD has paid insufficient attention to promoting the parents’ commitment and ensuring that intervention strategies are implemented with sufficient intensity to produce long-term gains in language or decreases in symptom severity. Second, researchers and clinicians have long recognized the heterogeneity in the clinical presentation of ASD, suggesting that any specific treatment may lead to beneficial outcomes in some children but not others. Thus, findings of conditional treatment effects are more likely to be the norm than the exception. For example, Carter et al. (2011) reported that Hanen’s ‘More Than Words’ intervention was facilitative of communication for children with low levels of object interest, but not children with high object interest (measured as the number of toys children played with in a differentiated, or functional, manner). Similarly, Yoder and Stone (2006) reported that children with low object interest acquired superior communication skills during a responsivity-based treatment relative to a contrast treatment. The current study contributes to this emerging body of research suggesting that interventions aiming to increase responsive parental behaviors may be particularly effective during early stages of development. Even though this finding is consistent with developmental theory, it runs counter to the notion that higher-functioning children are generally more likely to benefit from treatment than lower-functioning children with ASD.

For the 24 children with expressive language skills below 12 months at baseline, the current study revealed that treatment allocation accounted for about 25 % of the variance in children’s subsequent language gains, indicating medium to large effect sizes, f² (range) = 0.25 (0.09–0.36). On average, children in the control group who entered the study with expressive language skills below 12 months gained 3.5 months of expressive language skills between T1 and T3. Since the variance in children’s gain scores was 11.1 months, children in the experimental group gained an additional 2.8 months (25 % of 11.1 months), on average. Similar effect sizes have been reported in descriptive longitudinal research. For example, the analyses reported by Siller and Sigman (2008) reveal that the predictive relation between responsive parental behaviors and children’s subsequent language gains evidences a medium effect size, f² (range) = 0.15 (0.08–0.22). Thus, even though in a statistical sense (relative to the sample variance) effect sizes can be interpreted as medium to large, it is important to emphasize that 3 months of additional language gain will not allow children to close the developmental gap that separates them from their typically developing peers. On one hand, this research is encouraging for treatment researchers because it demonstrates that statistically significant treatment effects with medium effect sizes can be found in samples of low functioning children with ASD. On the other hand, this finding is a reminder that clinical significance needs to be interpreted in the context of children’s global developmental delays and symptom severity.

To inform public policy, treatment research needs to identify moderators and mediators of treatment gains in children with ASD (Lord and Bishop 2010; Rogers and Vismara 2008). Moderators allow us to predict who a given intervention may be appropriate for. In the current study, only parents classified as insightful at baseline (36 %) effectively changed their communication in response to the experimental intervention. Similarly, only children with expressive language skills below 12 months (34 %) evidenced reliable treatment effects on language outcomes. Mediators, on the other hand, inform us about how a given intervention causes its subsequent gains in child development. Recent advances in statistical methods for identifying mediators in treatment research have produced novel approaches to data analysis that are both more powerful and more accessible to researchers than conventional methods. For example, Fritz and MacKinnon (2007) demonstrated that, using those modern methods, a sample size of 74 participants provides sufficient statistical power to detect mediated effects, assuming medium effect sizes for both the treatment effect on the mediator and the outcome. This said, detecting mediators in a treatment study that also evidences conditional treatment effects increases the sample size requirements exponentially. For example, in a parent mediated intervention that changes parent behaviors in only some parents but not others, mediated effects can only be detected in the subgroup of parents for whom treatment effects were demonstrated (i.e., moderated mediation). Conversely, if only a certain subgroup of children is likely to benefit from changes in parental behaviors, only those children will provide information in detecting mediated effects (i.e., mediated moderation). The current study did not provide any evidence to suggest that children’s long-term language gains can be attributed to short-term gains in responsive parental communication. Given the complex pattern of conditional treatment effects identified in the current study, non-significant findings may not indicate a flawed conceptual theory, but rather a lack of statistical power.

Our ability to draw firm conclusions about underlying causal mechanisms is constrained by two important limitations of our research design. First, the intensity of services differed systematically between the experimental and control condition (fifteen sessions for FPI vs. four sessions for PAC). Thus, increased clinical attention alone could possibly contribute to the treatment effects identified in this research. Second, since the current study failed to demonstrate a significant main effect of treatment group allocation on children’s language outcomes, conditional treatment effects need to be interpreted with some caution. The crucial advantage of a randomized research design is that experimental and control groups are equivalent with regard to a broad range of child and family characteristics. Since moderating variables are not the result of random assignment, we must consider the possibility of spurious associations. For example, children with more or less advanced baseline language skills may also differ in terms of other clinical specifiers of ASD (e.g., severity of deficits in social communication or the presence of repetitive behaviors) as well as associated features (e.g., known genetic disorders, epilepsy, and intellectual disability). Similarly, children whose parents are classified as either insightful or non-insightful may differ with regard to a variety of demographic or child characteristics. Thus, even when putative moderators are being considered (i.e., moderators that were identified prior to the research and have a strong rationale), firm conclusions about the underlying causal mechanisms cannot be drawn. To demonstrate that FPI effectively increases responsive parental communication and that such increases are associated with subsequent gains in children’s long-term language outcomes, future research should specifically target subgroups of children and/or subgroups of parents who are most likely to benefit from treatment. Such a narrowly defined research sample would make it more feasible to enroll a sample providing sufficient statistical power to detect mediated effects.