Introduction

Individuals with autism spectrum disorders (ASD) are highly heterogeneous in their language abilities (Tager-Flusberg et al. 2005). Achievement of “useful speech,” or expressive language that may be used frequently, communicatively, referentially, and in a semantically diverse manner, by the end of the preschool years predicts later social and vocational success for individuals with ASD (Billstedt et al. 2005; DeMyer et al. 1973; Howlin et al. 2000; Kobayashi et al. 1992; Venter et al. 1992). Receptive language in the preschool years has also been linked with long-term outcomes of adolescents and young adults with autism (Howlin et al. 2004; Venter et al. 1992). The aforementioned findings have motivated theories and research seeking to explain variation in expressive and receptive language development of preschoolers with ASD.

Theoretical and Empirical Support for the Putative Predictors of Spoken Language in ASD

Using the extant literature, we have identified nine putative predictors of language growth in preschoolers with ASD who were minimally verbal at entry to the study. See Table 1 for a comprehensive list of these predictors, along with citations of example studies documenting that each predictor has predicted later language in at least two studies on children with ASD. However, no previous study examined all of the putative predictors in the same study. Later, we explain why doing so is important.

Table 1 Putative predictors of language and their empirical and theoretical support

Table 1 also includes labels for theories that lead to examining the nine putative predictors. The theories discussed here are by no means exhaustive or mutually exclusive. To ease communication, we categorize the theories discussed as: (a) child-focused (i.e., viewing variation in expressive and receptive language as secondary to variation in other child factors), or (b) transactional (i.e., viewing variation in expressive and receptive language as secondary to variation in the bidirectional exchange that occurs between the child and others).

Among the child-focused theories, variation in language in children with ASD is thought to be secondary to variation in cognitive, social, and/or motor abilities. Within the cognitively-oriented theories, potential sources for language variation in children with ASD include child-level variation in: (a) the ability to coordinate attention between a referent of communication and the communication partner, (b) object knowledge (a source of communication topics), and (c) symbolic or representational abilities (Chawarska et al. 2012; Clifford and Dissanayake 2009; Rodman et al. 2010; Thiemann-Borque et al. 2012; Walton and Ingersoll 2013). Within the socially-oriented theories, variation in social motivation, attention to others, and theory of mind have been cited as sources of variation in language of children with ASD (e.g., Camaioni 1997; Dawson et al. 1998; Ingersoll and Meyer 2011; Klin et al. 2003; Mundy and Newell 2007; Paul et al. 2007; Senju 2013; Tomasello et al. 2005; Wetherby and Prutting 1984). Within the motor-oriented theories, variations in motor planning and motor execution have been noted as possible sources of variation in expressive language in preschoolers with ASD (Gernsbacher et al. 2008; Belmonte et al. 2013).

Transactional theories consider parent- and child-level sources of child language variation, including variation in: (a) the child’s foci of attention (i.e., the communication partner versus other non-social aspects of the environment), (b) the parent’s linguistic input (i.e., the amount of processable linguistic input that is provided to the child), and (c) the child’s ability to match his or her productions to various aspects of the parent’s linguistic input (Adamson et al. 2001; Hudry et al. 2013; McDuffie and Yoder 2010; Siller and Sigman 2002, 2008; Shriberg et al. 2011).

The Importance of Identifying Value-Added Predictors

The conceptually overlapping theories suggest that the putative predictors will be associated with each other. Indeed, prior work has revealed noteworthy intercorrelations among many of the aforementioned predictors of language development in children with ASD (Toth et al. 2006; Luyster et al. 2008; Wetherby et al. 2007). Our long-term goal of increasing the proportion of children with ASD who can speak is not best served by simply amassing a list of predictors of language. Science seeks parsimonious explanations for variation in language, and clinicians seek to focus on the most impactful goals to enable spoken language. Although correlational studies cannot definitively identify causes of variation in language, one method for increasing the probability that predictors of language actually influence language is to rule out alternative explanations for the associations by testing whether each putative predictor explains later variation in language after controlling for the intercorrelation among predictors. Predictors that do so have added value.

Rationale for Considering Additional Background Variables in Models of Spoken Language Growth in ASD

No correlational study can control all covarying variables that might explain why particular theoretically-motivated predictors have empirical associations with language. Nonetheless, it is important to rule out the most obvious global variables that could do so. Severity of autism symptomatology and level of cognitive impairment are among the most salient global child variables that could account for empirical associations among our theoretically-motivated predictors and language growth (Bopp et al. 2009). Thus, these background variables need to be controlled when considering whether more theoretically-motivated predictors account for language growth in our sample.

Purpose and Research Questions

In this study, we seek to identify the predictors that continue to account for significant variance in expressive and receptive language growth of our sample after controlling for intercorrelation among other predictors in minimally verbal preschoolers with ASD (i.e., value-added predictors). Within our set of putative predictors, we distinguish between the theoretically-motivated putative predictors and the more global background variables because we consider the latter variables as less informative for scientific and clinical purposes. We seek to identify value-added predictors in initially nonverbal/minimally verbal children because doing so sheds light on possible obstacles to language acquisition in children with ASD. Value-added predictors during this period may be different from those identified later. For expressive language, our outcome measure is growth in expressive vocabulary and communicative word use. This outcome was selected for its sensitivity to early growth in expressive language in children with ASD, as well as its ability to index development of “useful speech” (i.e., speech that is frequent, communicative, referential, and semantically diverse). Our dependent variable indexing receptive language development in this population is growth in receptive vocabulary.

Our research questions are:

  1. Out of nine theoretically-motivated putative predictors and two background variables, which have added value in predicting expressive growth? The theoretically-motivated putative predictors of later expressive growth were early receptive vocabulary, object play, responding to joint attention, intentional communication, attention during child-directed speech (ACDS), motor imitation, nonimitative oral motor functioning, consonant inventory, and parental linguistic responses to child leads. The two background variables were cognitive impairment and autism symptomatology.

  2. Out of seven putative predictors and two background variables, which have added value in predicting receptive growth? The seven putative early predictors of later receptive growth were the same as those for expressive growth, with the exception of nonimitative oral motor functioning and consonant inventory. These two variables were not expected to be predictors of receptive vocabulary growth. The same two background variables were examined as predictors of receptive vocabulary growth.

Methods

Participants

Participants included 87 children (71 male and 16 female) who at entry to the study: (a) were between 24 and 48 months chronological age; (b) had a clinical diagnosis of ASD based on criteria in the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition-Text Revision (American Psychiatric Association 2000) and confirmed by the Autism Diagnostic Observation Schedule (ADOS; Lord et al. 1999; Gotham et al. 2007); (c) were reported to say no more than 20 different words according to parent report on the MacArthur–Bates Communicative Development Inventories: Words and Gestures checklist (Fenson et al. 2007); and (d) produced no more than five different word roots during a 15-min language sample. We additionally excluded children with severe sensory or motor impairments, identified metabolic or progressive neurological disorders, and identified genetic syndromes.

Parents reported 4 participants to be Hispanic and 83 to be non-Hispanic. For race, parents reported 65 children to be White, 16 to be Black/African American, 5 to be Asian, and 1 to be American Indian or Alaska Native. The formal educational levels of the primary caregivers were distributed as follows: 4 had less than a high school education, 19 had a high school diploma or equivalent, 21 had 1–2 years of college or technical school education, 28 had 3–4 years of college or technical school education, and 15 had attended graduate or professional school. Additional descriptive information on participants is provided in Table 2.

Table 2 Description of Participants

Design

This study used a longitudinal correlational design comprising five measurement points, separated by approximately 4 months each. Nine putative predictors of expressive language development and seven putative predictors of receptive vocabulary development were measured at Time 1 or Time 2, with a resulting 12–16 month interval between measurement of the predictors and the end of the study period for each participant. Two background variables were also measured at Time 1 and used as covariates: child cognitive level and child autism symptomatology.

Measures

The measures for this study included the parent questionnaires, standardized assessments, and behavior sampling procedures listed below. The measurement periods at which procedures were administered are provided in Table 3.

Table 3 Predictor constructs, procedures, untransformed component variable, and analyzed variable

ADOS

The ADOS was used at Time 1 to confirm clinical diagnoses of ASD, to describe our sample, and to measure autism symptomatology. Due to our selection criteria for the study, the ADOS Module 1, originally developed as the Prelinguistic Autism Diagnostic Observation Schedule (DiLavore et al. 1995), was used for all participants. Revised scoring algorithms (Gotham et al. 2007) were used as cut-offs for diagnoses of “autistic disorder” or “autism spectrum disorder.” The social communication algorithm total score was used as the measure of severity of autism symptoms (Lord et al. 2000).

Mullen Scales of Early Learning (MSEL; Mullen 1995)

This instrument was given at Time 1 to obtain mental ages and to derive developmental ratios (i.e., mental age/chronological age). The mental age was the average age equivalency score from four subscales: Visual Reception, Fine Motor, Receptive Language, and Expressive Language. The developmental ratio, rather than the standard score, was used as the index of cognitive impairment because the majority of participants had the lowest possible standard score of 49, thus curtailing the variability in cognitive levels reflected by standard scores. The standard score across the aforementioned four subscales is called the Early Learning Composite (ELC) and is provided for descriptive purposes in Table 2.
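The developmental ratio defined above can be illustrated with a minimal sketch. The function name and the example age-equivalent values are hypothetical; only the definition (the mean of the four subscale age equivalents divided by chronological age) comes from the text.

```python
from statistics import mean

def developmental_ratio(age_equivalents_months, chronological_age_months):
    """Mental age (mean of subscale age equivalents) divided by chronological age."""
    if chronological_age_months <= 0:
        raise ValueError("chronological age must be positive")
    mental_age = mean(age_equivalents_months)
    return mental_age / chronological_age_months

# Hypothetical child: age equivalents (in months) for Visual Reception,
# Fine Motor, Receptive Language, and Expressive Language.
example_ratio = developmental_ratio([12, 14, 8, 6], 30)
```

Unlike a floored standard score, a ratio computed this way keeps varying among children who all score at the bottom of the standard-score range.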

MacArthur Communicative Development Inventory-Words and Gestures Form (MCDI; Fenson et al. 2007)

This MCDI form is a checklist designed to measure receptive and expressive vocabulary size in young children. The parents were asked to fill out the comprehension and production columns of the vocabulary checklist at Times 1–5. Raw scores were used as receptive and expressive language measures.

Communication and Symbolic Behavior Scales-Developmental Profile Behavior Sample (CSBS; Wetherby and Prizant 2002)

The CSBS was given at all 5 measurement periods. This standardized, structured communication sample may be used to assess the communicative competence (use of eye gaze, gestures, sounds, words, understanding, and play) of children with a functional communication age between 6 and 24 months (chronological age approximately 6 months to 6 years). For this study, we used the weighted raw scores for the following scales: (a) Scale 3 (attention following) at Time 1; (b) Scales 4–7 (intentional communication) at Time 1; (c) Scale 11 (consonant inventory) at Time 1; (d) Scales 12–15 (word use) at Times 1–5; (e) Scale 16 (comprehension) at Times 1–5; and (f) Scale 17 (object play) at Time 1. These scores were derived according to the instructions in the administration manual, but were scored from media files rather than in situ.

Unstructured Communication Sample (UCS)

The UCS is a communication sample that was used to derive a more naturalistic measure of communicative word use than is available through the CSBS or the ESCS. This 15-min unstructured conversational language sample involves playing with developmentally appropriate toys. To reduce measurement error, it uses the same toys and adult-interaction style, wherein topic-following comments and questions are encouraged and redirectives are discouraged, across all children.

Early Social Communication Scales (ESCS; Mundy et al. 2003)

The ESCS was given at Time 1. It was used in addition to the CSBS to increase the number of sampling opportunities for measurement of responding to joint attention and intentional communication. The ESCS takes 15–25 min to complete. It includes a variety of situations designed to motivate young children to respond to attentional cues of others or to communicate in order to regulate the behavior of another person, to socially interact with another person, or to direct the other person’s attention to an object or event. The examiner makes eight attempts to direct the child’s attention to posters hanging on the wall by telling the child to “look” and simultaneously pointing an index finger to one of the posters.

Parent–Child Free Play (PCFP)

This procedure was administered at Time 2. The child and parent engaged in a 15-min unstructured interaction session. The adult was asked to, “play as you would at home if you had no interruptions and had time to play with your child.” A standard set of developmentally appropriate toys was made available to the parent–child dyad. This context served as a measurement context for parent linguistic responses.

Parent–Child Snack Session (PCS)

This procedure was administered at Time 2 as an additional sample for measurement of parent linguistic responses. Snack time provides a familiar activity in which child requests are frequently elicited, thereby providing increased opportunities for parental responses. The procedure lasted 10 min. The parent and child were seated at a table. The parent was told, “We want to see how your child communicates during snack times. Just interact with him as you would at home if you wanted to elicit his communication.” The parent was provided with a 4 oz. cup, a pitcher of juice, and several single-bite cookies, crackers, or parent-provided snack. Child preferences were assessed by phone prior to scheduling the procedure.

Attention During Child-Directed Speech (ACDS; Watson et al. 2010)

This procedure was given at Time 1. For this assessment, the child was seated at a table facing a puppet theater that contained a window in which all stimuli were presented. The child first saw a 3-min presentation of a music video, which in previous research proved helpful in attracting the interest of young children with ASD (Watson et al. 2010, 2012). After a 10-s break, three 1-min child-directed speech vignettes were presented: (a) a video of a woman reading a children’s picture book, (b) a brief live puppet show delivered by a research assistant, and (c) a video of a woman playing with and describing a novel toy. All speakers were adult females who used vocal intensity, frequency, pitch, and duration consistent with characteristics of natural child-directed speech. A small, unobtrusive video camera mounted just below the theater window was used to capture a clear image of the child’s face and eyes when s/he was oriented toward the theater window. Coding of these videos is described under Procedures.

Motor Imitation Scale (MIS; Stone et al. 1997)

The MIS was given at Time 1. This measure was developed to assess immediate motor imitation in young children with autism. It consists of 16 items involving single-step motor imitation acts, half involving body movements only and half involving actions with objects. Each item is scored in situ as 0, 1, or 2 on the basis of the quality and accuracy of the imitation. Adequate internal consistency (.88) and 2-week test–retest reliability (.80) were found for the total imitation score in a sample of 30 2-year olds with diagnoses of autism spectrum disorder, developmental delay, or language disorder. The MIS total score was used as one of our measures of motor ability.

Developmental Play Assessment (DPA; Lifter 2001)

The DPA was given at Time 1. It was designed for use with children who have a range of disabilities, including autism. The procedures include collection of a video-recorded, 10-min sample of the child engaging in unstructured play with two groups of toys in the presence of an examiner. The examiner does not direct the play, but comments descriptively on what the child does with the toys. This behavior sample was used to code the number of different play acts.

Oral Motor Examination (OME)

The OME was modified from Amato and Slavin (1998) for the purposes of this study. The adaptation included 18 items, divided into two subtests: (a) the Eating Behaviors Scale, comprising seven items, such as tongue lateralization, mandibular stability, and sucking, that assess spontaneous oral motor skills while eating a snack; and (b) the Nonverbal Volitional Oral Abilities Scale (NVOA), comprising 11 items, such as tongue lateralization, blowing, and puckering lips, that assess oral motor imitation. Eating behaviors were scored as 0 or 1, whereas the NVOA items were scored as 0, 1, or 2. Higher scores reflect better skills. No demonstration was provided during the Eating Behaviors Scale; participants simply ate and drank as they normally would. The NVOA Scale required the participant to imitate oral motor movements demonstrated by the examiner. Both subscales included assessment of motor behaviors of the tongue, lips and jaw. Children’s performance was scored in situ. In previous research, the raw score on each non-adapted subscale discriminated verbal versus nonverbal children with autism between the ages of 2.5 and 4 years (Amato and Slavin 1998). Raw scores from each adapted subtest were the metrics used in the analyses for this project.

Coding

Variables derived from the CSBS, PCFP, PCS, ESCS, DPA, UCS, and ACDS were coded from media files of these procedures. The UCS, ESCS, PCFP, and PCS were coded using Procoder DV (Tapp 2003), and the ACDS was coded using Observer XT (http://www.noldus.com/human-behavior-research/products/the-observer-xt). For the DPA, custom software was developed with an extensive list of actions that a child might perform with each of the objects in the two toy sets (Tapp and Yoder 2000). For all assessments that were coded or scored from media files, one of every five videos was randomly chosen for interobserver reliability. Discrepancies in scoring or coding were discussed to prevent observer drift. Coding manuals can be requested from the first author.

Parent linguistic responses were coded from the PCFP and PCS. For both procedures, behaviors were coded within 5-s intervals. For the PCFP, coders identified “codable” intervals (i.e., intervals in which the child and parent were visible on the screen) during a first pass using a momentary interval coding method. On a second pass, coders used a partial interval coding method to mark each codable interval that had at least one child attention lead (i.e., child touched or looked at object). On a third pass and using a partial interval coding method, parent linguistic responses to child attention leads (parent talking about the object referenced by the child lead, the action referenced by the child lead, or both) were coded. The PCS was coded similarly with two exceptions. In addition to child attention leads, child communication leads (see below for definition of intentional communication) were coded in the second pass and adult linguistic responses to either type of child lead were coded in the third pass.

The UCS was coded for communicative word use by the child using a timed-event behavior sampling method. Coders completed a first pass to mark the sections of the media file that were codable. On a second pass, coders marked children’s intentional communication acts, defined as: (a) nonconventional gestures, non-word vocalizations, or imitative symbols (signs or words) that occurred with coordinated attention to an object and an adult; (b) conventional gestures with attention to the adult; or (c) spoken words and American Sign Language approximations. Spoken words used as part of intentional communication acts were transcribed, and the number of different word roots the child used during the UCS was counted (e.g., “car” and “cars” share the same word root, so would be counted only once) using Systematic Analysis of Language Transcripts (Miller and Chapman 1990).

The ESCS was coded using event behavior sampling and a paper and pencil method of recording decisions. The number of intentional communication acts (regardless of pragmatic function) and the number of responses to joint attention bids were coded. For this sample, communication acts, defined in accordance with the ESCS manual, included child gestures, vocalizations, and/or verbalizations that were directed to an adult in the room and that served an identifiable communicative function (i.e., requesting, initiating joint attention, or social interaction). For response to joint attention trials, the child was given credit if s/he looked in the direction of the adult’s attention directive.

The DPA was coded using a timed-event behavior sampling method to record when a predetermined type of action with an object was used. Single instances of any of these predetermined actions were sufficient to count as different play acts, as long as the action was differentiated according to the affordances of the object (i.e., not banging, shaking, mouthing, or atypical examining).

The ACDS media files were coded using a timed-event behavior sampling method. Observation time was coded as either: (a) “child looking” at the CDS stimuli presented in the puppet theater window, or (b) “child not looking.” Because of slight variations in the length of one of the child-directed speech vignettes (the live puppet show), this variable was computed as a proportional score (# of seconds child looked at child-directed speech stimuli/total # of seconds of child-directed speech vignettes).
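The proportional score described above can be sketched as follows. The function name, the representation of looking bouts as (onset, offset) pairs in seconds, and the example values are assumptions made for illustration; they are not the export format of any coding software.

```python
def proportion_looking(looking_bouts, total_vignette_seconds):
    """Seconds spent looking at the stimuli divided by total vignette seconds.

    looking_bouts: list of (onset, offset) times in seconds, assumed non-overlapping.
    """
    if total_vignette_seconds <= 0:
        raise ValueError("total vignette duration must be positive")
    seconds_looking = sum(offset - onset for onset, offset in looking_bouts)
    return seconds_looking / total_vignette_seconds

# Hypothetical child: two looking bouts across 180 s of vignettes.
example_proportion = proportion_looking([(0, 30), (40, 70)], 180)
```

Dividing by the actual vignette duration keeps scores comparable across children when the live puppet show runs slightly long or short.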

Reliability

Interobserver reliability was documented by independently coding a random sample of 20 % of all coded sessions from media files. The primary coder was blind to which sessions would be checked for reliability. The intraclass correlation coefficients (ICCs) computed using a two-way random model for absolute agreement for the variables included in the present analyses by procedure are as follows: (a) CSBS attention following at Time 1 (.95); (b) CSBS play at Time 1 (.98); (c) CSBS consonant inventory (.98); (d) CSBS comprehension subscale Times 1–5 (M = .99, SD = .08); (e) CSBS word subscale Times 1–5 (M = .94, SD = .04); (f) ESCS response to joint attention at Time 2 (.99); (g) ESCS intentional communication at Time 2 (.97); (h) DPA number of different play actions at Time 1 (.98); (i) PCS number of linguistic responses at Time 2 (.98); (j) PCFP number of linguistic responses at Time 2 (.98); (k) UCS number of different word roots at Times 1–5 (M = .98, SD = .008); and (l) ACDS proportion of time looking at CDS vignette at Time 1 (.99).
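For readers unfamiliar with this index, the following is a minimal sketch of a two-way random effects ICC for absolute agreement. It assumes the single-measure form, often written ICC(2,1); the text does not state whether single or average measures were reported. The function name and data layout are hypothetical.

```python
def icc_absolute_agreement(ratings):
    """Single-measure ICC, two-way random effects, absolute agreement.

    ratings: list of rows, one per session, each holding one score per coder.
    """
    n = len(ratings)       # number of sessions
    k = len(ratings[0])    # number of coders
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
```

Because the coder mean square enters the denominator, a coder who scores consistently higher than another lowers this coefficient even when the two rank sessions identically. That is the sense in which it indexes absolute agreement rather than consistency.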

Data Analysis Decisions

Rationale for Using Individual Growth Curve Modeling and Related Decisions

An application of mixed level modeling in which measurement periods are clustered within participant (i.e., growth curve modeling) was used to test the research questions. When five or more measurement periods are used, such models have the potential to produce more precise estimates of change than do alternative ways of quantifying change (Maxwell 1998). The intercept of growth curves can be interpreted as an estimate of language level at the measurement period at which Time in Study is centered. When growth curve parameters were used as dependent variables, Time in Study was centered at Time 5 so the intercept would be interpretable as Time 5 language level. The unconditional growth model used to test the research questions was the most parsimonious model that best explained variance in language growth.
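In standard two-level notation, a linear growth model with measurement periods nested within participants can be written as below. This is one common form, shown with random intercepts and slopes; the symbols are illustrative and are not taken from the original analysis. Here c is the value at which Time in Study is centered, so that the intercept for child i estimates that child's language level at that period.

```latex
\begin{aligned}
\text{Level 1:}\quad & Y_{ti} = \pi_{0i} + \pi_{1i}\,(\mathrm{Time}_{ti} - c) + e_{ti} \\
\text{Level 2:}\quad & \pi_{0i} = \beta_{00} + r_{0i} \\
                     & \pi_{1i} = \beta_{10} + r_{1i}
\end{aligned}
```

With c set to the Time 5 value, the intercept \pi_{0i} is interpretable as the Time 5 language level, as described above.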

Rationale for Using Aggregate Measures of Constructs

In participants in the early stages of learning or developing a skill, past work has indicated that averaging scores from multiple valid measures of a construct produces variables that are more stable than those produced by a single measure (Sandbank and Yoder 2014). Additionally, including separate, but multiple, measures of each predictor construct would have exceeded the recommended ratio of predictors to participants, which in turn could have produced unstable estimates of the associations among predictors and between predictors and language growth. Reducing the number of predictors by aggregating those measuring the same construct is one way to reduce the number of predictors without undue loss of information. Finally, direct observation and parent report measures of child language have complementary advantages for measuring useful speech. For example, direct observation has the advantage of allowing judgments of referential use, conventionality, and frequency of occurrence, but has the disadvantage of limited observation opportunities. Parent report has the advantage of drawing from a wide range of experience with the child, but has the disadvantage of having unknown adherence to the referential, conventional, and frequency criteria of useful speech. Thus, aggregating across direct observation and parent report measures was thought to produce a more valid measure of language than component variables. Aggregate measures were used only when empirical findings justified the aggregation.

Criteria and Procedures Used to Aggregate Measures

To decide whether to use single measures or aggregates of multiple measures of the same construct, component variables with content validity for measuring the same construct had to intercorrelate at >.39. Component variables were aggregated by averaging z-transformed component scores. Z-score transformation allows combining of component measures with different metrics and equal weighting of scores from procedures with differing opportunities for key behaviors. Equal weighting is recommended when the basis for differential weighting has not been replicated or does not have strong theoretical rationale (Kerby 2003). To transform component variables for aggregate predictors, the mean and SD in the z-score transformation formula (z = [x − mean]/SD) were obtained from the period at which the component variable was measured. Because the average language score was expected to increase with time, using each measurement period’s mean to compute each period’s z score would have resulted in aggregates that would be insensitive to real change. Using a single period’s mean and SD for all periods’ z-transformed scores results in scores that can show change. Selecting the Time 5 mean and SD enabled estimating deviance from the end-period’s average in SD units, a meaningful reference point. Thus, to transform component variables for aggregate measures of language growth, Time 5 means and Time 5 SDs were used so that variation in growth on the aggregate language variables could be quantified.
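The aggregation procedure can be sketched as follows, using the conventional z-score form z = (x − mean)/SD with the mean and SD taken from a single reference period. The function names, measure labels, and values are hypothetical, and the use of the sample SD (n − 1 denominator) is an assumption.

```python
from statistics import mean, stdev

def z_against_reference(score, reference_scores):
    """z = (score - reference mean) / reference SD (sample SD assumed)."""
    return (score - mean(reference_scores)) / stdev(reference_scores)

def aggregate_score(child_scores, reference_scores_by_measure):
    """Average a child's z-scores across component measures (equal weighting)."""
    z_scores = [
        z_against_reference(child_scores[name], reference_scores_by_measure[name])
        for name in child_scores
    ]
    return mean(z_scores)

# Hypothetical Time 5 sample scores for two component measures.
time5_scores = {"parent_report": [10, 20, 30], "observed_words": [2, 4, 6]}
# Hypothetical Time 1 scores for one child, standardized against Time 5.
child_time1 = {"parent_report": 0, "observed_words": 2}
aggregate_time1 = aggregate_score(child_time1, time5_scores)
```

Because every period is standardized against the same Time 5 mean and SD, a child's aggregate rises across periods as raw scores rise. Standardizing within each period would instead force every period's mean to zero and hide that growth.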

Rationale for and Procedures Used to Transform Analyzed Variables

The analysis methods we used assume multivariate normality (Enders 2010; Singer and Willett 2003). Multivariate normality is more likely when univariate distributions do not grossly depart from descriptors of the normal distribution (Tabachnick and Fidell 2001). When analyzed variables (indicated in Table 3) or language growth aggregates had univariate skewness or kurtosis with an absolute value greater than .8 or 3.0, respectively, they were transformed. When untransformed scores had negative values, a constant was added before transformation. Transformations were selected in accordance with the principles in Tabachnick and Fidell (2001).
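The screening rule can be sketched as below. Several choices here are assumptions not stated in the text: skewness and kurtosis are computed with moment-based population formulas, kurtosis is expressed as excess kurtosis (0 for a normal distribution), and the log transformation with a shift that makes the minimum equal 1 is only one of the options a researcher might select. All function names are hypothetical.

```python
import math

def skewness_and_kurtosis(values):
    """Moment-based skewness and excess kurtosis (population formulas)."""
    n = len(values)
    center = sum(values) / n
    m2 = sum((x - center) ** 2 for x in values) / n
    m3 = sum((x - center) ** 3 for x in values) / n
    m4 = sum((x - center) ** 4 for x in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

def needs_transformation(values, skew_limit=0.8, kurtosis_limit=3.0):
    """True when either absolute statistic exceeds its threshold."""
    skew, kurt = skewness_and_kurtosis(values)
    return abs(skew) > skew_limit or abs(kurt) > kurtosis_limit

def shifted_log10(values):
    """Add a constant so the minimum becomes 1 (if needed), then take log10."""
    lowest = min(values)
    shift = 1 - lowest if lowest < 1 else 0
    return [math.log10(x + shift) for x in values]
```

For example, a variable on which most children score zero and a few score very high would exceed the skewness threshold and be flagged, whereas an evenly spread variable would not.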

Rationale and Plan to Select the Measure of Receptive Vocabulary When Analyzed as a Putative Predictor

Early receptive vocabulary was expected to be a value-added predictor of both expressive and receptive language growth because (a) it is a common predictor of language growth, and (b) it has a strong theoretical rationale for considering the association to be causal. However, receptive vocabulary is difficult to measure in a valid manner during the nonverbal stage in children with autism (Charman et al. 2003). Thus, we planned to use a different approach to selecting the measure of early receptive vocabulary than was used for measuring other putative predictors. Specifically, we sought to use the intercept of the growth curve for the aggregated receptive vocabulary measure. However, rather than centering Time in Study at Time 5, as we did when considering receptive vocabulary growth as a dependent variable, we centered Time in Study at Time 1 to quantify early receptive vocabulary as a predictor. By doing so, the intercept of the growth curve could be interpreted as an estimate of Time 1 receptive vocabulary level. We refer to this variable as the Time 1 estimated receptive vocabulary. Arguably, values for this parameter represent a more valid estimate of Time 1 receptive vocabulary level than the component or aggregated observed scores (Singer and Willett 2003).

Hierarchical Linear Modeling software was used to obtain the ordinary least squares (OLS) estimates of the intercept in the unconditional linear growth model of the receptive vocabulary aggregate. These OLS estimates of the intercept were then used as Level 2 predictors in the mixed level models of language growth. As the results will show, we were able to use this strategy to test the added value of early receptive vocabulary in predicting expressive, but not receptive, language growth. The preliminary results section reviews findings that led to use of a different strategy for indexing initial receptive vocabulary level for predicting receptive vocabulary growth.
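The effect of the centering choice on the intercept can be illustrated with a per-child OLS line. This is a sketch of ordinary least squares fitting in general, not a reproduction of the software output; the function name, the month values, and the scores are hypothetical.

```python
def ols_line(times, scores, center_at):
    """OLS intercept and slope with time re-expressed as (time - center_at)."""
    x = [t - center_at for t in times]
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(scores) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, scores))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical child measured at five periods roughly 4 months apart.
months = [0, 4, 8, 12, 16]
scores = [1.0, 3.0, 5.0, 7.0, 9.0]
time1_intercept, slope = ols_line(months, scores, center_at=0)
time5_intercept, _ = ols_line(months, scores, center_at=16)
```

The slope is unaffected by centering. Only the intercept changes: it estimates the Time 1 level when time is centered at the first period and the Time 5 level when centered at the last.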

Approach to Handling Missing Data

Depending on the variable, putative predictors had between 0 and 33 % missing data. Thus, multiple imputation (MI), a state-of-the-art approach to missing data analysis, was carried out using PROC MI in SAS. MI produces less biased regression coefficients than more traditional methods of handling missing data (e.g., listwise deletion, last observation carried forward; Enders 2010). Following Enders (2010), all putative predictors and criterion variables were used as sources of information to impute missing values. If transformations were needed, metrics were transformed before imputation (von Hippel 2009). Forty data sets were imputed because past research has indicated that pooled parameter estimates from such a high number of imputed data sets are less biased than pooled parameter estimates from fewer imputed data sets (e.g., 5 or 10; Graham 2009). After imputed data sets were created, imputed scores were deleted for the two criterion variables because not doing so has been shown to bias regression estimates (von Hippel 2007). All participants had observed scores for at least one language measure at two or more measurement periods.
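Estimates from multiply imputed data sets are commonly pooled with Rubin's rules. The text does not state the pooling formula, so the sketch below is an assumption about the general technique rather than a description of the authors' procedure; the function name and example values are hypothetical.

```python
from statistics import mean, variance

def pool_rubin(estimates, squared_standard_errors):
    """Pool one coefficient across m imputed data sets using Rubin's rules.

    Returns the pooled estimate and its pooled standard error.
    """
    m = len(estimates)
    pooled_estimate = mean(estimates)
    within = mean(squared_standard_errors)   # average sampling variance
    between = variance(estimates)            # sample variance across imputations
    total = within + (1 + 1 / m) * between
    return pooled_estimate, total ** 0.5

# Hypothetical coefficient estimated in three imputed data sets.
pooled, pooled_se = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The (1 + 1/m) factor shrinks toward 1 as the number of imputations grows, so a larger number of imputed data sets adds less simulation noise to the pooled variance.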

Results

Preliminary Results

Change and Status of the Participants at Time 5

Using the criteria for ‘nonverbal/minimally verbal status’ indicated in the sample selection criteria, we computed the proportion of the sample that continued to meet these criteria at 16 months after entry into the study. Forty percent remained nonverbal/minimally verbal, while 60 % exceeded the selection criteria for this status at Time 5. Using the most easily interpretable component variables, those from the MCDI, the average gain in words said was 75 (SD = 95, d = .84) and the average gain in words understood was 91 (SD = 95, d = .85). Means and SDs are given for two component measures of expressive language at Time 5 in Table 2.

Creation of Aggregated Dependent Variables for Expressive and Receptive Language

The intercorrelation among the three expressive language variables (MCDI expressive, CSBS word scale, UCS number of different words) ranged from .70 to .82. The intercorrelation between the two receptive vocabulary variables (MCDI receptive and CSBS comprehension scale) was .61. Thus, each set of component language variables was aggregated to create an expressive language aggregate and a receptive language aggregate.
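The aggregation decision can be sketched as follows. The exact aggregation formula is not given in this section, so the example assumes a common approach: standardize each component measure and average the z-scores, but only when every pairwise correlation reaches a minimum value. The cutoff of .40, the function name, and the data are assumptions for illustration; the sketch also assumes complete data.

```python
import numpy as np

def aggregate_if_correlated(components, min_r=0.40):
    """Average z-scored component measures if all pairwise r >= min_r.

    components: array of shape (n_children, n_measures).
    Returns the aggregate score per child, or None if aggregation
    is not justified by the intercorrelations.
    """
    x = np.asarray(components, dtype=float)
    r = np.corrcoef(x, rowvar=False)
    off_diagonal = r[~np.eye(r.shape[0], dtype=bool)]
    if off_diagonal.min() < min_r:
        return None
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)
    return z.mean(axis=1)

# Hypothetical scores for five children on pairs of measures.
related = np.column_stack([[1, 2, 3, 4, 5], [2, 4, 6, 8, 10]])
unrelated = np.column_stack([[1, 2, 3, 4, 5], [1, -1, 0, -1, 1]])

aggregate = aggregate_if_correlated(related)        # array of z-score means
no_aggregate = aggregate_if_correlated(unrelated)   # None, since r is 0
```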

Selection of the Unconditional Models for Language Growth

Centering Time in Study at Time 5, the model with fixed and random effects for the intercept and slope (i.e., the full simple linear model that considers individual variability in both the rate of growth across time in the study and outcome at the end of the study) was the best fit to the data for both expressive and receptive growth. The unconditional growth model for expressive language was yhat = .29 + .04(time) + .017. The unconditional growth model for receptive language was yhat = .29 + .01(time) + .006. In both cases, data met the assumption of homoscedasticity. More complex models (e.g., random and fixed quadratic parameters added) either did not improve the fit to the data or failed to converge in many imputed data sets. The reliabilities of the intercept and slope for expressive language growth in the selected full simple linear model were .95 and .89, respectively. The reliabilities for intercept and slope for receptive vocabulary growth in the selected full simple linear model were .87 and .65, respectively.

Findings Related to Selection of Analyzed Predictors of Language Growth

The analyzed predictor and background variables are indicated in Table 3 along with their corresponding constructs, procedures, component variables, and roles in the analyses. Below we discuss the findings relevant to deciding whether constructs were measured by single measures versus aggregates.

Intercorrelation Among Component Measures of Predictor Constructs

The component variables aggregated for each predictor construct are indicated in Table 3. The component variables for play, RJA, intentional communication, motor imitation, and parental linguistic responses were correlated within construct at .60, .60, .45, .55, and .40, respectively. Another measure of intentional communication, the Communication subscale of the CSBS (Scales 4–7) was insufficiently related to the other two measures of intentional communication to warrant inclusion in the aggregate (r < .36). Thus, the Communication subscale of the CSBS was not analyzed as a predictor of language growth.

Empirical Rationale for Measuring Three Constructs with a Single Measure

Three putative predictor constructs were measured by single measures: ACDS, nonimitative oral motor functioning, and consonant inventory. This decision was made because: (a) the component variables that we had considered conceptually similar to these variables did not demonstrate the empirical associations necessary to warrant aggregation, and (b) no other variables in our test battery were conceptually similar enough to these measures (i.e., showed face validity for measuring the same construct) to warrant their consideration for aggregation. The component variables that we hypothesized would be related to ACDS were the measures of responding to joint attention; however, these variables correlated with ACDS at <.26. The component variables that we thought would be related to nonimitative oral motor functioning were oral motor imitation and consonant inventory; the latter variables correlated with the former at <.27.

Selecting the Measure of Early Receptive Vocabulary as a Putative Predictor of Receptive Vocabulary Growth

Models that included the Time-1 centered intercept in the receptive vocabulary growth curve predicting receptive vocabulary growth did not converge; therefore, another early receptive language predictor of subsequent receptive vocabulary growth was selected using a two-step process. First, the intercorrelation of the three receptive language scores (MCDI Comprehension, CSBS Comprehension, and Mullen Comprehension) at Time 1 was examined. The magnitude of these correlations was insufficient to justify aggregation across these component variables (r = .24 to .39). Second, the added value of each of these three early receptive language measures was tested. Only the MCDI comprehension raw score explained unique variance in the growth of receptive vocabulary level after controlling for the other two early receptive language measures (p < .001). Thus, the MCDI comprehension score was selected as the putative predictor for receptive vocabulary growth in analysis. To distinguish this Time 1 receptive vocabulary measure from the one used to predict expressive language growth (i.e., Time 1-centered intercept of receptive growth), we refer to the putative predictor of receptive growth as parent-reported receptive vocabulary.

Selection Rules for Predictors in Final Statistical Models

The number of subjects constrained the number of putative predictors that could be included in the model to five per growth parameter because a high ratio of predictors to participants may prevent models from converging. The following process was used to systematically reduce the number of predictors in the expressive and receptive language models, while still retaining those predictors with value added (see Fig. 1).

Fig. 1 Example of process used to reduce the number of putative predictors

  1. Starting with the full set of putative predictors in Table 3, we identified those with significant zero-order (i.e., with no other putative predictors statistically controlled) associations with at least one of the language growth curve parameters (i.e., intercept or slope). Results of these analyses are detailed in Table 4.

    Table 4 Results of tests for zero-order correlates with language growth

  2. Of the set of significant zero-order predictors of language growth or outcome, highly intercorrelated (r > .39) predictor variables were examined in the same model to remove those without added value. Intercorrelations among predictors are detailed in Table 5.

    Table 5 Pearson’s r values among significant zero-order predictors of language

  3. Analogous to Step 2, highly intercorrelated, significant zero-order background variables that predicted language growth or outcome were examined in the same model to remove those without added value.

  4. Predictors surviving Steps 2 and 3 were entered into the model with zero-order predictors of language growth that had low (r < .40) intercorrelation with other zero-order predictors. Step 4 yielded the associations between predictors and criterion variables after controlling for all other predictors in the model and thus determined which predictors showed evidence of adding value in predicting expressive and receptive language growth and outcomes.

  5. An analogous process was used to identify value-added background variables. The surviving background variable, if any, was added to the surviving putative predictor list to identify value-added variables when both putative predictors and background variables were considered.
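The sorting of predictors by intercorrelation in Steps 2 and 4 can be sketched as below. This is only an illustration of the stated cutoff (r > .39 treated as r ≥ .40), not the analysis code used in the study; the function name, predictor labels, and correlation values are hypothetical.

```python
import numpy as np

def split_by_intercorrelation(names, corr, threshold=0.40):
    """Separate predictors into highly intercorrelated pairs and the rest.

    names: predictor labels, in the same order as the rows of corr.
    corr:  square matrix of Pearson r values among the predictors.
    Returns (high_pairs, low), where high_pairs lists pairs with
    r >= threshold (to be tested together, as in Step 2) and low lists
    predictors with no such pair (entered directly, as in Step 4).
    """
    corr = np.asarray(corr, dtype=float)
    n = len(names)
    high_pairs = [
        (names[i], names[j])
        for i in range(n)
        for j in range(i + 1, n)
        if corr[i, j] >= threshold
    ]
    involved = {name for pair in high_pairs for name in pair}
    low = [name for name in names if name not in involved]
    return high_pairs, low

# Hypothetical predictors and correlations.
names = ["predictor_a", "predictor_b", "predictor_c"]
corr = [
    [1.00, 0.55, 0.10],
    [0.55, 1.00, 0.20],
    [0.10, 0.20, 1.00],
]
high_pairs, low = split_by_intercorrelation(names, corr)
```

In this made-up case, predictor_a and predictor_b would be examined in the same model to see which retains added value, while predictor_c would be carried forward unchanged.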

Identifying Value-Added Predictors of Expressive and Receptive Language Growth

The model testing the added value of zero-order predictors for expressive language growth included: (a) intentional communication, (b) consonant inventory, (c) responding to joint attention, (d) parent linguistic responses, and (e) estimated receptive vocabulary. Findings from the above model resulted in dropping estimated receptive vocabulary as a predictor. When the selected background variable, cognitive impairment, was added to the model, cognitive impairment did not have added value. Therefore, the final model for expressive growth included: (a) intentional communication, (b) consonant inventory, (c) responding to joint attention, and (d) parent linguistic responses (see Table 6).

Table 6 Value-added predictors of language growth, statistical tests, and effect sizes

The model testing the added value of zero-order predictors of receptive vocabulary growth included: (a) responding to joint attention, (b) parent-reported receptive vocabulary, (c) intentional communication, (d) ACDS, and (e) parent linguistic responses. Findings of this model resulted in dropping ACDS. Adding the selected background variable, autism symptomatology, did not result in dropping any more predictors. Therefore, the final model for receptive growth included: (a) responding to joint attention, (b) parent-reported receptive vocabulary, (c) intentional communication, (d) parent linguistic responses, and (e) autism symptomatology (see Table 6).

All predictors had positive, statistically significant associations with the parameter of language growth indicated. Autism symptomatology was reverse scored (i.e., reflected) for analysis. The effect sizes for individual predictors ranged from small to moderate (e.g., responding to joint attention predicting Time 5 expressive language) to very large (e.g., early parent-reported receptive vocabulary predicting Time 5 receptive vocabulary). For both expressive and receptive language, intentional communication, parent linguistic responses, and responding to joint attention had added value in predicting language growth. Consonant inventory had added value in explaining variance in expressive language growth. Parent-reported receptive vocabulary and autism symptomatology had added value in explaining variance in receptive vocabulary growth. Pseudo R squares for the total model predicting individual differences in Time 5 expressive and receptive language level (i.e., intercept) were .50 and .57, respectively. The pseudo R squares for the total model predicting individual differences in average expressive and receptive growth rate (i.e., slope) were .49 and .26, respectively.
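For readers unfamiliar with pseudo R square statistics in growth models, one common definition is the proportional reduction in the variance of a growth parameter (intercept or slope) when predictors are added to the unconditional model. The text does not state the formula that was used, so the sketch below is an assumption about the conventional calculation, and the variance values are hypothetical.

```python
def pseudo_r_squared(var_unconditional, var_conditional):
    """Proportional reduction in a Level 2 variance component.

    var_unconditional: variance of the growth parameter with no predictors.
    var_conditional:   residual variance of that parameter with predictors.
    """
    if var_unconditional <= 0:
        raise ValueError("unconditional variance must be positive")
    return (var_unconditional - var_conditional) / var_unconditional

# Hypothetical variance components for an intercept.
example_r2 = pseudo_r_squared(0.020, 0.010)
```

Under this definition, a value of .50 would mean that the predictors account for half of the between-child variance in the growth parameter.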

Discussion

By virtue of the selection criteria for putative predictors in the current study, all of our identified predictors with added value for predicting expressive and receptive language growth in initially nonverbal children with ASD have already been replicated in the extant literature. The contribution of the current study lies in our determination of which variables “drop out” of the models after controlling for intercorrelation with other predictors. Few previous studies could analyze predictors of expressive and receptive language growth of preschoolers with ASD in this way due to a more limited set of predictors or a smaller sample size.

For both modalities (receptive and expressive), the putative predictors that did not have added value were: (a) object play, (b) ACDS, (c) motor imitation, and (d) nonimitative oral motor functioning. Additionally, cognitive impairment dropped out of the model predicting growth for both modalities of language. The simplest explanation for variables dropping out is that they had high intercorrelation with one or more other predictors of language growth. Given their high intercorrelation, this is the most probable explanation for why object play, motor imitation, attention to child-directed speech, and cognitive impairment dropped out of predictive models. However, nonimitative oral motor functioning simply had nonsignificant zero-order associations with language. Although one interpretation is that nonimitative oral motor functioning does not explain variability in expressive language outcomes for young children with ASD as a group, it also is possible that our measure of oral motor functioning was insufficiently sensitive to the aspects of oral motor functioning that are predictive of expressive language. Similarly, our measure of autism symptomatology (i.e., the social–communication algorithm score from the ADOS module 1) may not have been sufficiently sensitive to survive the winnowing process for expressive language.

Surprisingly, early receptive vocabulary did not continue to predict expressive language growth after controlling for the other predictors of expressive language. In past work controlling for only a few predictors, early receptive language had an exceptionally strong relation with later expressive language in children with ASD, despite receptive language being measured by only a few items on a single instrument (Wetherby et al. 2007). In the current study, we controlled for many predictors, and early receptive vocabulary dropped out. Presumably, the intercorrelation of early receptive vocabulary with multiple predictors caused it to drop out of the current study’s model predicting expressive growth.

Prior work indicates that most of the variables identified as having added value for prediction of spoken language in preschoolers with ASD are modifiable through treatment (i.e., malleable). Parent linguistic responses are a dyadic, not solely parental, variable. One can treat this variable by increasing the number of child leads and by increasing the consistency of parental responsivity to these leads. For both of these potential treatment targets, we have replicated evidence that intervention can affect generalized outcomes in dyads with children with ASD (Kasari et al. 2006; Mahoney and Perales 2005; Siller et al. 2013). For intentional communication, there is evidence that treatments can facilitate highly generalized use of intentional communication in initially nonverbal or minimally verbal children with ASD, even for the most socially-motivated pragmatic functions (Ingersoll and Wainer 2013; Yoder et al. 2014). For early receptive language, there is replicated evidence that treatments can influence highly generalized changes in children with ASD (Dawson et al. 2010; Fava et al. 2011). For RJA, there is only one study with high internal validity that shows treatments can affect generalized changes in children with ASD (Kasari et al. 2006). Thus, replication of treatment effects on highly generalized RJA is needed. No published studies have indicated that comprehensive early intervention improves broader social-communication autism symptomatology in preschoolers with ASD as measured by diagnostic instruments such as the ADOS. Additionally, to our knowledge, there is no evidence that treatment can affect highly generalized changes in consonant inventory among initially nonverbal or minimally-verbal children with ASD. Thus, there is a clear need for further research in this area.

Theories that Value-Added Predictors Represent

Multiple theories could be invoked to explain why the identified predictors have added value in predicting language growth in children with ASD (see Table 1). It is not our intent to promote one theory over another. In fact, the conceptual overlap between “competing” theories makes it difficult, if not impossible, to produce sufficient divergent evidence to select one theory versus the other. For example, intentional communication and RJA have been discussed in cognitive and social theories of language variation in children with ASD (Chawarska et al. 2012; Clifford and Dissanayake 2009). Consonant inventory is listed in one instantiation of a variant of cognitive theory (Shriberg et al. 2011) and in motor theories (Adams 1998; Gernsbacher et al. 2008). In contrast, parental linguistic responses are clearly dyadic, as children must provide the leads for parents to provide the linguistic response. As such, it is a prototypical example of a predictor that transactional theories of language variation would promote (Bottema-Beutel et al. 2014). Identifying early receptive vocabulary as a value-added predictor of later receptive vocabulary growth underscores the importance of early representational abilities.

Future Research Needs

This study highlights the need for further research in several areas. For example, we know little about the development and treatment of consonant inventory growth in nonverbal children with ASD. This deficit in our knowledge is salient because consonant inventory had added value for predicting expressive language growth despite it being the only theoretically-motivated predictor that was quantified by a single variable. Additionally, the surprising absence of early receptive vocabulary in the list of value-added predictors of expressive growth calls attention to the relative paucity of research on the directionality of longitudinal associations between expressive and receptive language in initially nonverbal or minimally-verbal children with ASD. Future investigations may advance our understanding in each of these areas and contribute to clinical decisions on whether to specifically target receptive language in language intervention programs for nonverbal or minimally verbal children with ASD, and if so, what sequence of intervention targets (i.e., receptive first or expressive first) is optimal for promoting gains in this population.

Weaknesses of the Study

The above must be considered in the context of the study’s weaknesses. First, although this particular longitudinal correlational study ruled out many alternative explanations for the associations of interest, no correlational design can rule out all covarying variables that might explain the associations between our predictors and language growth in preschool children with ASD (i.e., the third variable problem still exists). One can only rule out these third variable explanations by experimentally manipulating levels of a predictor while controlling for potentially correlating variables.

Additionally, our sample size, though quite large relative to many previous studies examining predictors of spoken language in ASD, was still insufficient to afford entering all 11 putative predictors into the same statistical models of receptive and expressive language growth. Doing so would have simplified the process of identifying predictors with added value for explaining individual differences in expressive and receptive language growth in our sample of preschoolers with ASD. Because we did not have sufficient power to implement this simpler statistical analysis plan, we used an empirically-guided approach to identifying predictors with value-added validity. However, using such empirically-guided methods can result in sample-specific results because sample variances in predictors and criterion variables can influence the size of intercorrelation among predictors and language growth. An empirically-guided approach to selecting predictors was viewed as appropriate in this study because there was insufficient theoretical or empirical guidance on the most useful order of predictor entry into the model. Future replication will improve our confidence in the variables identified as having added value for explaining population variability in language growth in initially nonverbal or minimally-verbal children with ASD.

Third, the interval between the period at which parent linguistic responses were measured and the end-point (12 months) was shorter than the corresponding interval for some of the other predictors (16 months). Although the timing of administering the parent–child sessions was justifiable given the large number of other procedures already administered at Time 1, it is possible that linguistic responses had an unfair advantage relative to predictors that were measured only at Time 1.

Finally, like any study seeking to test explanatory models of language growth, only a subset of all possible predictors was selected for study. We selected predictors that were both theoretically supported and empirically demonstrated as predictors of later language in at least two prior studies of children with ASD. Novel, but potentially more explanatory, variables were possibly excluded. Future research may identify variables that further improve our predictions of language growth for children with ASD.

Strengths of the Study

Despite these weaknesses, several strengths are readily identifiable for this study. First, the longitudinal correlational design affords more convincing explanatory models of variability in language development than concurrent correlational designs. Variables that were proposed to potentially influence language were measured prior to the measurement of language outcomes at Time 5. Time 5 measurement of expressive and receptive language outcomes occurred 12–16 months after the measurement of the predictors. This relatively long interval between measurement of predictors and the language end points allows sufficient time for much variability in language development to occur and is a clinically useful interval over which variability was predicted.

Additionally, when empirically and theoretically justified, multiple measures were combined to quantify individual differences on predictor and criterion constructs. Fortunately, purported measures of the same construct were sufficiently intercorrelated to justify producing aggregate measures of most constructs. Importantly, one of these constructs was language. Aggregating across multiple measures yields more stable estimates of individual differences among participants in the early stages of skill development than single measures of constructs (Sandbank and Yoder 2014) and improves our confidence that results are not overly influenced by a single measurement approach or instrument.

Furthermore, the relatively large sample size in the current study afforded one of the best tests of added value in predicting language growth in initially nonverbal or minimally-verbal preschoolers with ASD to date. Having 87 participants in the study sample allowed us to statistically control for the intercorrelation among several putative predictors when testing their added value for explaining variance in language development. The large number of previous studies involving smaller samples of children with ASD provided sound rationale for selecting our nine theoretically important predictors and two background predictors of language as putative predictors.

Finally, the analysis methods used in this study were among the best available. Using MI to handle missing data allowed us to reduce bias in estimating the coefficients for the association between predictors and language growth relative to list-wise or pair-wise deletion of participants (Enders 2010). Growth curve analysis allowed more precise estimation of end-points, beginning-points, and growth rates for expressive and receptive language in our sample than analytic methods that rely on single measurement periods.

Conclusion

To our knowledge, this is the most comprehensive test of the added value of early predictors of expressive and receptive language growth in initially nonverbal or minimally-verbal preschoolers with ASD. It is possibly unnecessary and inefficient to target all predictors of later language as therapeutic goals. Empirically-grounded clinical decisions on which goals to address can be informed by models identifying the minimal number of predictors that add value in accounting for variation in language growth. This study provides such information.