Introduction

In recent years, the number of individuals with autism and other developmental disabilities has increased dramatically. Recent estimates indicate that nearly 1.5% of the population of 8-year-old children currently carry a diagnosis of autism (1 in 68 individuals; Christensen et al. 2016), representing a 30% increase in the number of diagnoses since 2008. In addition, the number of individuals with developmental disabilities is even higher, with 15% of the population of children between 3 and 17 being diagnosed with a developmental disability (1 in 6 individuals; Boyle et al. 2011). As a result of this increase in diagnoses, there is an increased demand for treatment to help address the special needs of these individuals.

One of the most important forms of treatment for individuals with autism lies in special education. Estimates from multiple US states indicate that as the number of individuals diagnosed with autism increases, so does the number of individuals who receive special education services (Loiacono and Allen 2008; Newschaffer et al. 2005; Pinborough-Zimmerman et al. 2012). Given the diversity of symptoms that individuals with autism experience, it is vital that their education is developed to best meet each individual’s specific needs. The process of designing individualized treatment involves both accurate assessment and effective and efficient intervention. Throughout this process, it is not only ethically imperative to provide services which are research-based and effective, it is also legally mandated (Individuals With Disabilities Education Act 2004).

The vast majority of research-based intervention for individuals with autism originates from behavioral interventions. The National Autism Center (NAC) recently released a summary of interventions currently available for individuals with autism (2015). In their review, the NAC looked at which interventions had several published, peer-reviewed articles (Established Interventions), which had few published, peer-reviewed articles (Emerging Interventions), and which may or may not be based on research (Unestablished Interventions). Within the category of Established Interventions for individuals with autism, behavioral interventions (i.e., antecedent and consequent interventions) comprised the largest subset of interventions. In addition, the empirical support for the effectiveness of behavioral interventions was nearly 6 times the amount of support that the next most supported Established Intervention had. Initial support for behavioral interventions, or interventions rooted in applied behavior analysis (ABA), emerged decades ago; perhaps one of the most well-known studies is that of Lovaas (1987), who demonstrated that early intensive behavioral intervention resulted in increased scores on IQ tests for students. In the nearly three decades between Lovaas’ study and now, research supporting the use of ABA as a treatment for autism has continued to grow (see Rosenwasser and Axelrod 2001 and Rosenwasser and Axelrod 2002 for a two-part review; see also Roane et al. 2016) and emphasizes the need for disseminable, research-based behavioral intervention.

Within the field of ABA, many resources have surfaced in recent years in an attempt to better disseminate ABA techniques to the general public. Despite the fact that the number of board certified behavior analysts working in the community is rising, the individuals providing care for children with autism the majority of the time are not trained behavior analytic professionals (Loiacono and Allen 2008). Many children spend the majority of their time at home, with their parents providing the primary care, and are taught by teachers or teacher’s aides during the school day. As such, it is important that packaged ABA interventions not only incorporate ABA methodologies, but are also effective in their own right as assessment and treatment curricula which can be implemented by individuals without explicit behavior analytic training. In recent years, several attempts have been made to provide resources to serve this purpose. These resources encompass a wide range of approaches, including books detailing the methods of ABA, assessments of skills, and premade curricula for training new skills. However, very few of these packaged interventions have any published, peer-reviewed data attesting to their reliability, validity, or effectiveness (Dixon et al. under review).

Evaluating all three aspects (reliability, validity, and effectiveness) of these protocols is important. It is vital that we, as a scientific field, are able to demonstrate that the tools we provide to practitioners are effective as they are currently designed, reliably produce positive results when implemented, and actually teach the skills they purport to. Though previous research has reliably shown that ABA methodologies are effective in the treatment of individuals with autism, further investigation of individual protocols is necessary to establish their individual value as assessment and curriculum tools. Such investigation includes various forms of evaluation, such as test-retest reliability, randomized controlled trials, and psychometric analysis.

One recently developed assessment rooted in ABA methodology which has begun the process of empirically evaluating a packaged protocol is the Promoting the Emergence of Advanced Knowledge Relational Training System (PEAK; Dixon et al. under review). The PEAK consists of four modules designed to specifically assess and instruct skills in four different areas: The Direct Training Module (PEAK-DT; Dixon 2014a), which focuses on traditional discrete trial training procedures to teach specific skills; The Generalization Module (PEAK-G; Dixon 2014b), which focuses on both training and testing skills to ensure that generalization of learned skills has been achieved; The Equivalence Module (PEAK-E; Dixon 2015), which focuses on both training and testing skills to assess an individual’s ability to exhibit derived relational responding in the form of equivalence; and The Transformation Module (PEAK-T; in press), which focuses on both training and testing skills to assess an individual’s ability to exhibit derived relational responding across multiple relational frames. Each module contains an introduction which reviews some basic principles of incorporating ABA methodologies, as well as details for how to teach using ABA. In addition, each module also consists of 184 individual programs with explicit testing and training instructions for teachers or other caregivers to reference, which comprise the assessment and curriculum sections of the modules.

Over recent years, several empirical investigations have been published regarding the PEAK’s reliability, validity, and effectiveness. The most researched module is the first module which was released: The PEAK-DT Module (Dixon 2014a). This research has found that the PEAK-DT Module has high interobserver reliability (Dixon et al. 2014a, b, c, d, 2016; McKeel et al. 2015a, b, c; Rowsey et al. 2015), test-retest reliability (Dixon et al. 2016), strong convergent validity with measures of language (Dixon et al. 2014a, b, c, d; McKeel et al. 2015a, b, c), intelligence (Dixon et al. 2014a, b, c, d), and other ABA-based assessment tools, namely the Verbal Behavior Milestones Assessment and Placement Program (VB-MAPP; Sundberg 2008) and the PEAK-G (Dixon et al. 2014a, b, c, d). It has also demonstrated effectiveness as a curriculum for training novel language skills (McKeel et al. 2015a, b, c) and high efficacy as a training tool per a randomized controlled trial study (McKeel et al. 2015a, b, c), and an initial investigation has examined a normative sample’s performance on the PEAK-DT Assessment (Dixon et al. 2014a, b, c, d).

In addition to the previously mentioned types of reliability, validity, and effectiveness that have been measured in the PEAK-DT, further analysis was completed on its content validity. In this case, content validity refers to the ability of the assessment to identify response classes which underlie larger classes of skills. That is to say, by analyzing which skills within the assessment are interrelated, smaller subgroups of skills may be identified. Although the previous research indicated that the PEAK-DT module is reliable and effective, research on the underlying structure of the module helps to elucidate how certain skills develop in relation to one another. Rowsey et al. (2015) investigated these relationships using a principal component analysis (PCA). PCAs are generally used to break down assessments which measure larger overall “constructs” (e.g., skill repertoires, verbal ability, etc.) into smaller components (e.g., specific subcategories of skills; daily living skills might be broken down into self-cleansing skills, chores, etc., and “intelligence” might be broken down into communication skills, self-help skills, listener skills, etc.). From a behavior analytic standpoint, we might define “overarching constructs” as colloquially defined response classes. As such, the need to identify subcategories (i.e., to begin to narrow down and eventually operationally define our targets) becomes clearer. By using procedures such as PCA, we begin to see which sets of response classes may emerge together and which sets of response classes seem to occur independently of one another. Rowsey et al. (2015) found that the 184 skills within the PEAK-DT Module can be further broken down into four components. These components represented groupings of skills which may develop together, providing practitioners with an initial understanding of when to begin training specific skills dependent on the learner’s existing repertoire.
While PCAs are not common in ABA research, they are widely used by psychologists and nearly all other scientific disciplines (Abdi and Williams 2010). Through the use of PCAs, a large set of measured variables can be reduced to a smaller set of constructs (Fabrigar and Wegener 2011) allowing for assessment of how various skills may relate to one another.

Though the aforementioned line of research represents a first step towards empirically validating the PEAK-DT module as a reliable, valid, and effective assessment and curriculum tool, more research is required to demonstrate the same traits with the PEAK-G and other PEAK modules. Just as it is important to evaluate each individual protocol rooted in ABA to determine its effectiveness in its own right, each module of the PEAK is a standalone assessment and curriculum tool and must be assessed individually. As such, the purpose of the current study was to investigate the underlying structure of the PEAK-G module through a PCA with 84 children with diagnoses of autism.

Methods

Participants and Setting

The participants in the current study consisted of 84 individuals diagnosed with an autism spectrum disorder between the ages of 5 and 21 (M = 12.51, SD = 4.83). The sample consisted of 75 males and 9 females who attended a school for individuals with autism and other disabilities in the Midwest. All assessment sessions and recruitment took place at one of the school’s two locations. Assessments themselves were conducted either in the student’s classroom or in an alternate room within the school to minimize interference with either the assessment procedure or the classroom’s activities. Classrooms consisted of several tables and desks for the students, a desk and chair for the teacher, overhead storage cabinets, and a large metal cabinet where potential reinforcers were kept. For sessions conducted within the classroom, the participant being assessed was moved to a side area of the classroom at a table by themselves to minimize disturbance of the rest of the students. Classrooms typically contained a teacher, an aide, and 3–7 students at any given time. Assessments which took place in a separate room from the participant’s classroom were held in a room with no other students which included a table with two chairs and bookshelves containing books and other stimuli. The assessor brought the required stimuli for the assessment in a plastic storage bin which was set off to the side of the room during assessment sessions.

Materials

Participants were assessed using the instructions from the PEAK-G Assessment. The PEAK-G Assessment consists of 184 items targeting learning and language skills. The skills assessed ranged from basic learning readiness skills such as following simple directions, sharing, and waiting to advanced learning skills such as logic, advanced language skills including sarcasm and detecting lies, problem solving, and mathematics. The instructions of the PEAK-G indicate what constitutes a generalized skill (i.e., the ability to identify novel stimuli which would be included in a previously established stimulus class). The PEAK-G Assessment itself contains several pages which list the title of the assessment and the goal for each skill. Each goal explicitly states the response which is required to be demonstrated for the assessor to score that the participant has mastered the associated item. Next to the goal is a section to mark either “yes,” “no,” or “unknown” regarding the participant’s ability to demonstrate mastery of that skill. Typical materials for the assessment included books, items of varying shapes and sizes, pictures of common objects, animals, and people, pencils, paper, etc.

Procedure

Each participant was assessed by a trained graduate student assessor using the PEAK-G Assessment. These assessors worked at the school that the participants attended and were familiar with each of the participants. To minimize any potential interference with the participants’ education or that of their peers, assessments were only conducted during non-instructional time (e.g., during recess or downtime in between classwork). For any participant who was unable to complete the assessment in one session (e.g., the participant began engaging in severe problem behavior or was required to return to class to avoid missing any academic instruction), the session was discontinued and resumed either later on that day or on a separate date. All of the assessments were conducted in one or more sessions which cumulatively totaled between 10 and 120 min to complete, dependent on the functioning level of the participant.

Each assessment began with the assessor indirectly scoring the skills which they knew the participant either could or could not demonstrate. Then, any skills which were not scored indirectly were directly assessed by the assessor. For the purposes of the current study, no items were left scored as unknown. If an assessor was uncertain about the ability of the participant to demonstrate the goal for a given item they directly assessed that target to confirm that the participant could or could not demonstrate the skill. Each item on the assessment was scored in order until the participant received a score of “no” for five consecutive items. Once 5 consecutive “no” scores were recorded, all items following that point in the assessment were also scored as “no.” Finally, a total PEAK-G Assessment score was calculated by summing the number of items that the participant was able to demonstrate mastery of (i.e., a “yes” was scored). While no consequences were provided based on correct or incorrect responding during the assessment procedures, participants were offered preferred items or activities prior to the beginning of each assessment session which they would receive contingent upon participation following the completion of the assessment session. In addition, children were provided social praise or small edibles intermittently throughout the assessment session in between assessment trials for continuing to attend to the assessor and follow instructions.
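The discontinuation rule and total score described above can be sketched in a few lines of Python. This is a minimal illustration, not the scoring procedure published with the PEAK-G; the function name `score_assessment` and the parameter `stop_after` are hypothetical, and responses are assumed to be already resolved to "yes" (True) or "no" (False) with no "unknown" scores remaining.

```python
from typing import List


def score_assessment(responses: List[bool], stop_after: int = 5) -> int:
    """Return the total score under a discontinuation rule.

    responses: item scores in assessment order (True = "yes", False = "no").
    stop_after: number of consecutive "no" scores after which every
                remaining item is treated as "no" (5 in the text above).
    """
    total = 0
    consecutive_no = 0
    for mastered in responses:
        if consecutive_no >= stop_after:
            # Discontinuation point reached: all later items count as "no".
            break
        if mastered:
            total += 1
            consecutive_no = 0
        else:
            consecutive_no += 1
    return total


# Hypothetical response pattern for illustration only; not study data.
# Three "yes" scores, five consecutive "no" scores, then two "yes" scores
# that are ignored because the rule has already been triggered.
example_total = score_assessment([True] * 3 + [False] * 5 + [True] * 2)
```

Under this sketch, the two final "yes" responses do not contribute to the total because they follow five consecutive "no" scores.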

Following completion of the assessment and attainment of the participants’ PEAK-G Assessment scores, the scores for each of the 184 items on the PEAK-G for all participants were compared to assess the relationships among the items in the assessment. Prior to beginning data collection, the methods for the current research were approved by both the Participants’ school’s Institutional Review Board and the Human Subjects Committee at Southern Illinois University, Carbondale.

Data Analysis

Principal Component Analysis

To assess the content validity of the PEAK-G Assessment, the resultant data from the PEAK-G Assessments were analyzed using a principal component analysis (PCA) with a direct Oblimin rotation and Kaiser normalization. This method was chosen because there were no a priori assumptions regarding the underlying structure of the PEAK-G and the primary purpose of the current study was an exploratory analysis of the components underlying the PEAK-G as a whole. The direct Oblimin rotation was incorporated as it was reasonable to presume that the skills within the PEAK-G were correlated with one another (i.e., many of the items within the PEAK-G measure similar or presumably related skills) and oblique rotations such as the Oblimin rotation allow for this (Fabrigar and Wegener 2011). The criterion for inclusion of a component within the principal component analysis was an eigenvalue greater than 3. Each item was then fitted to a specific component by evaluating the component correlation scores for each item as it related to each component using the component structure matrix. The largest correlation score in absolute value (negative or positive) indicated the strongest relationship; therefore, each item was assigned to the component with which it had the highest absolute correlation.
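The retention and assignment rules described above can be illustrated with a short Python sketch. It covers only the two decision rules (eigenvalue cutoff and largest absolute structure-matrix correlation); it does not perform the extraction or the direct Oblimin rotation, which would come from statistical software. All function names and the numeric values are hypothetical and are not data from the study.

```python
from typing import Dict, List, Sequence


def retained_components(eigenvalues: Sequence[float], cutoff: float = 3.0) -> List[int]:
    """Indices of components whose eigenvalue exceeds the cutoff."""
    return [i for i, ev in enumerate(eigenvalues) if ev > cutoff]


def assign_items(structure: Sequence[Sequence[float]]) -> List[int]:
    """For each item (row), return the index of the component (column)
    with the largest absolute correlation in the structure matrix."""
    return [max(range(len(row)), key=lambda j: abs(row[j])) for row in structure]


def items_per_component(assignments: Sequence[int], n_components: int) -> Dict[int, int]:
    """Count how many items were assigned to each component."""
    counts = {j: 0 for j in range(n_components)}
    for j in assignments:
        counts[j] += 1
    return counts


# Hypothetical eigenvalues and structure matrix, for illustration only.
eigenvalues = [6.1, 3.4, 3.1, 0.9]
structure = [
    [0.82, 0.30, -0.10],   # item 1: strongest with component 0
    [0.25, -0.77, 0.40],   # item 2: strongest (negative) with component 1
    [0.10, 0.35, 0.66],    # item 3: strongest with component 2
    [-0.71, 0.20, 0.15],   # item 4: strongest (negative) with component 0
]

kept = retained_components(eigenvalues)
assignments = assign_items(structure)
counts = items_per_component(assignments, len(kept))
```

A component that meets the eigenvalue cutoff but ends up with a count of zero in `counts` would have no items assigned to it, which is a situation worth checking for before settling on a final model.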

Analysis of Internal Consistency

Once the components within the PEAK-G Assessment were established, Cronbach’s Alphas were computed for each individual component as well as for the PEAK-G Assessment as a whole. High scores on the Cronbach’s Alpha would indicate strong internal consistency within the PEAK-G and its underlying components. A large Cronbach’s Alpha score coupled with significant results from the principal component analysis would therefore indicate that there is unidimensionality among the assessed variables. That is to say, a high Cronbach’s Alpha score indicates that the items being assessed are closely related to one another.
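For readers less familiar with the statistic, the standard formula for Cronbach’s alpha is k/(k - 1) multiplied by one minus the ratio of the summed item variances to the variance of the total scores, where k is the number of items. The sketch below applies that formula using only the Python standard library. It is an illustration of the general statistic rather than the software procedure used in the study; the function name and the small data set are hypothetical.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(scores: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a participants-by-items score table.

    scores: one row per participant, one column per item
            (e.g., 1 = "yes", 0 = "no").
    Uses sample variances throughout. Raises ZeroDivisionError if all
    participants have the same total score.
    """
    k = len(scores[0])
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical scores for four participants on three items; not study data.
example_scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
alpha = cronbach_alpha(example_scores)  # 0.75 for this toy table
```

The same function could be applied to the columns belonging to one component at a time, or to all items at once, to mirror the component-level and whole-assessment values described above.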

Results

Principal Component Analysis

To assess the content validity of the PEAK-G, a PCA was conducted across all 184 items in the PEAK-G Assessment. The PCA initially yielded 5 components with eigenvalues greater than 3. All programs within the PEAK-G were included in the analysis as the correlation matrix indicated that each item had a correlation with at least one item that was greater than 0.3, and none of the 184 items had communalities less than 0.3.

Table 1 depicts the initial eigenvalues and sums of squared loadings for each of the initial 5 components. These initial values indicated that the first component accounted for 73.29% of the total variance, the second component for 9.37% of the variance, the third component for 4.16% of the variance, the fourth component for 2.90% of the variance, and the fifth component for 1.85% of the variance. In spite of the fact that each of the five components had eigenvalues greater than 3, analysis of the rotated structure matrix indicated that none of the items from the PEAK-G loaded onto the fifth component (i.e., none of the items was more strongly correlated with the fifth component than one of the first four components). Given that the fifth component accounted for only 1.85% of the variance and the remaining four components accounted for a cumulative 89.71% of the variance, a four component model was accepted as the final model of the PEAK-G Module.
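The relationship between eigenvalues and percentages of variance can be made concrete with a small calculation. The sketch below assumes the PCA was run on the correlation matrix of the 184 items, in which case the total variance equals the number of items and each component's share is its eigenvalue divided by 184. That assumption, the function names, and the back-calculated eigenvalues are illustrative; they are not values reported in Table 1.

```python
N_ITEMS = 184  # number of PEAK-G items entered into the PCA

# Percentages of variance reported in the text for the five initial components.
reported_pct = [73.29, 9.37, 4.16, 2.90, 1.85]


def implied_eigenvalue(pct: float, n_items: int = N_ITEMS) -> float:
    """Eigenvalue implied by a percentage of variance (correlation-matrix PCA)."""
    return pct / 100 * n_items


def pct_for_eigenvalue(eigenvalue: float, n_items: int = N_ITEMS) -> float:
    """Percentage of total variance corresponding to an eigenvalue."""
    return eigenvalue / n_items * 100


# Share of variance that corresponds to the eigenvalue cutoff of 3.
cutoff_pct = pct_for_eigenvalue(3.0)        # roughly 1.63%

# Cumulative share of the first four components, from the rounded values.
cumulative_four = sum(reported_pct[:4])     # roughly 89.72%

# Back-calculated (not reported) eigenvalues for each component.
implied = [implied_eigenvalue(p) for p in reported_pct]
```

Under this assumption, the fifth component's 1.85% corresponds to an eigenvalue of about 3.4, consistent with it clearing the cutoff of 3. Summing the rounded percentages gives about 89.72%, and the small difference from the reported 89.71% is what one would expect from rounding.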

Table 1 Total variance explained

Given the large number (184) of items within the PEAK-G which were assessed within the PCA, the communalities, correlation matrix, and final component loadings have not been presented in the current manuscript; however, these data may be obtained by contacting the current authors. Figure 1 displays a visual summary of which items from the PEAK-G loaded onto which components. Overall, the first component contained 33 items, the second component contained 59 items, the third component contained 63 items, and the fourth component contained 29 items. A panel of behavior analysts reviewed the content of each item within the four components and convened to determine appropriate component names to reflect the general content of each. The panel consisted of 5 board certified behavior analysts who all had experience working with children with autism and other developmental disabilities. Based on their suggestions, the first component was named Foundational Learning and Basic Social Skills and included skills such as vocal imitation, prerequisite learning skills (e.g., sharing, waiting, requesting attention), motor skills, motor imitation, and independent play skills. The second component was named Basic Verbal Comprehension, Memory, and Advanced Social Skills and included skills such as chained imitation, intraverbals, sequencing, responding after a delay, asking and responding to questions, and empathic responding. The third component (Component 4 in Tables 1 and 2, and Fig. 1 [Component 3 represented skills deemed to be more complex than Component 4, therefore during the naming of components Component 4 was the third most complex]) was named Advanced Verbal Comprehension, Reading and Writing, and Basic Problem Solving Skills and included skills such as detecting lies, transcription, reading letter and word sounds, beginning math skills, and beginning problem solving skills. The fourth component (Component 3 in Tables 1 and 2, and Fig. 1) was named Verbal Reasoning, Problem Solving, Logic, and Mathematical Skills and included skills such as complex communication (e.g., detecting sarcasm, evaluating rhyme schemes), applied mathematics (e.g., solving problems, using money, telling time, etc.), measurement, and using logic to problem solve. The correlations between each of the four components are displayed in Table 2.

Fig. 1 PEAK Generalization Component Loadings. Programs included within each of the four components identified within the PEAK Generalization Module

Table 2 Correlation between each of the four PEAK components

Analysis of Internal Consistency

To test the internal consistency of the PEAK-G Assessment and each of the underlying 4 components, Cronbach’s Alphas were computed across the PEAK-G as a whole as well as each individual component. Table 3 displays the results of Cronbach’s Alphas. Overall, the findings indicate a high degree of internal consistency across the PEAK-G as a whole as well as each of the 4 underlying components.

Table 3 Cronbach’s alpha scores for the PEAK generalization module and underlying components

Discussion

The results of the current study indicate that the PEAK-G represents 4 components, each comprised of interrelated language and learning skills. These findings replicate the previous findings of Rowsey et al. (2015) by demonstrating that the PEAK-G Module is comprised of 4 components, as is the PEAK-DT Module, and that both modules and their related components demonstrate strong internal consistency. While research on both the Direct Training Module and the Generalization Module of the PEAK has yielded four underlying components, the content of these components seems to differ. This is likely due to the fact that the content of the specific assessment and curriculum items for the PEAK-DT and the PEAK-G differs. While many similar skills are addressed by the two books, there are also skills unique to each module. As of yet, this remains an empirical question worthy of further study. The current findings also extend previous research on the PEAK-G Module by demonstrating a form of validity that has not been previously researched.

While the components described within the current manuscript are derived from empirical data, they are not operationally defined as they might be in a traditional ABA approach. Regardless, there is value in incorporating psychometric measures which are rarely used within ABA, in that doing so allows for greater communication with psychological fields outside of the behavior analytic community and provides a description of the skills contained within the PEAK-G that may be more easily understood by the general public. Using these constructs may help practitioners identify how the skills represented by items within the PEAK-G develop, and thus when it is appropriate to begin training them based on the individual learner’s current skill repertoire. That is, as learners begin to master skills, it may be beneficial to target new skills which have been indicated by the current findings to be related (i.e., they are in the same “component”). Due to these skills being related, it may be more likely that the individual will be successful in acquiring the new related skill, though this remains to be empirically demonstrated.

In spite of the positive findings, the current study had several limitations. First, the sampling procedure implemented in this study may impact the generalizability of the findings. Although the attendees of the school from which participants were recruited come from a wide area of the Midwest state in which the school is located, the sample is still limited to that subsection of the overall population of individuals with autism. A convenience sample such as the one used in the current study may lead to a more homogeneous sample, thereby limiting the extent to which the findings are generalizable to the entire population of individuals with autism. Second, the current study used a relatively small sample size. Textbooks and other literature on principal component and factor analyses frequently state rules of thumb for the number of participants needed for a sample; however, these rules are typically based on anecdotal evidence as opposed to theoretical or empirical bases (Fabrigar and Wegener 2011). In addition, the recommended number of participants tends to vary widely and is determined by a large number of variables within the research design. Generally, when variables have high communalities (an average of .70 or higher) and more than 3 to 5 variables are associated with each component, good estimates can be obtained even with small sample sizes (Fabrigar and Wegener 2011). The current model reflects these requirements, with communalities averaging 0.92 and a minimum of 29 variables loading on each component. Finally, the use of individuals with autism to construct the PCA model in the current study may limit the generalizability of the findings to individuals with other, or no, diagnosed disabilities. It is possible that typically developing individuals and individuals with other disabilities develop their skill repertoires differently over time than do individuals with autism.
This remains an empirical question; however, the PEAK-G module is designed for implementation with children with autism and other developmental disabilities, lending validity to the use of a sample of children with autism.

Future research should address the limitations of the current sample by including more participants from a more demographically varied area. That is, participants from many different locations and of many different demographic backgrounds should be included to further bolster the generalizability of the findings. In addition, future research should compare these results to those of a normative sample to see if or how the results of typically developing individuals may differ from those of individuals with autism. Future research should also consider conducting a confirmatory factor analysis to further bolster support for the current model. Finally, future research should continue to investigate other forms of reliability, validity, and effectiveness for all PEAK Modules as well as other ABA-based protocols designed to assess or instruct individuals with autism or other developmental disabilities.

Altogether, the current findings support a four-component model of the PEAK-G Module and demonstrate strong internal consistency. These findings present another step towards empirical validation of an assessment and curriculum tool for individuals with autism. As the need for treatment continues to grow, it is imperative that the field of ABA be able to provide research-based and research-validated treatment options to the individuals who deliver that treatment.